January 16 to 31 (#2 of 2024)

2024-02-01

👋👋 Hi, I'm Domingo!

Second fortnight of the year, January 16 to 31, and the second issue of the newsletter. Here we are again, talking about things that have interested me over these last fifteen days.

Thank you very much for reading!

🗞 News

1️⃣ We begin once again with DeepMind. On January 17 they published in Nature the article AlphaGeometry: An Olympiad-level AI system for geometry, in which they present a language model that has been taught to solve geometry problems. The model achieves a success rate similar to that of the best humans and far surpasses the best existing algorithms, which are based on symbolic models that carry out automatic theorem proving.

The model is trained on 100 million automatically generated symbolic expressions representing correct geometric relations. From that data it is able to generalize and generate auxiliary constructions ("pulling rabbits out of hats", in the words of its authors) that help a symbolic engine solve the problem. The symbolic engine then completes the proof using the hint added by the language model.
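To make that division of labor concrete, here is a minimal toy sketch in Python of a propose-and-deduce loop. It is not DeepMind's code: the rule format, the string facts, and the names `deduce`, `solve`, and `propose_midpoint` are all assumptions made for illustration, and the "language model" is a hard-coded stub that suggests a single construction.

```python
from typing import Callable, Optional

# A rule is (premises, conclusion); facts are plain strings in this toy.
Rule = tuple[frozenset[str], str]

# Hypothetical rules for an isosceles triangle ABC with AB = AC.
RULES: list[Rule] = [
    (frozenset({"D=midpoint(B,C)"}), "BD=DC"),
    # Side-side-side: AB=AC, BD=DC, and AD is shared.
    (frozenset({"AB=AC", "BD=DC"}), "triangle(ABD)~triangle(ACD)"),
    (frozenset({"triangle(ABD)~triangle(ACD)"}), "angle(B)=angle(C)"),
]


def deduce(facts: set[str], rules: list[Rule]) -> set[str]:
    """Toy symbolic engine: forward-chain until no rule adds a new fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def propose_midpoint(known: set[str], goal: str) -> Optional[str]:
    """Stand-in for the language model: suggests one auxiliary construction."""
    hint = "D=midpoint(B,C)"
    return None if hint in known else hint


def solve(
    facts: set[str],
    goal: str,
    rules: list[Rule],
    propose: Callable[[set[str], str], Optional[str]],
    max_constructions: int = 3,
) -> Optional[list[str]]:
    """Alternate deduction and proposed constructions.

    Returns the constructions that were needed, or None if the goal
    was not reached.
    """
    known = set(facts)
    added: list[str] = []
    for _ in range(max_constructions):
        known = deduce(known, rules)
        if goal in known:
            return added
        hint = propose(known, goal)
        if hint is None:
            return None  # the proposer has run out of ideas
        known.add(hint)
        added.append(hint)
    return added if goal in deduce(known, rules) else None


print(solve({"AB=AC"}, "angle(B)=angle(C)", RULES, propose_midpoint))
```

In this toy the symbolic engine alone gets stuck, because nothing follows from AB = AC by itself; only after the proposer adds the midpoint D can the rules chain through to the equal angles.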

Although at first Hassabis applauded the advance by saying on X that it was a "step toward AGI", he later toned down his enthusiasm and deleted that phrase, leaving a more neutral tweet.

It is not clear how this work can be used to improve current LLMs. The domain to which it is applied is very restricted: geometry problems. And the problems have to be formulated in a specific mathematical language, so it is the human who must translate the geometry problem into that language.

What we are confirming once again (last fortnight it was chess, this time it is geometry) is that the LLM architecture can learn and generalize from almost anything, provided we have a large enough dataset.

One of the things being investigated in order to get closer to AGI is making LLMs capable of working with plans: learning to generate them, analyze them, execute them, and modify them.

I am sure there are already people building LLMs trained not on the final programs that exist on GitHub, but on the whole history of changes that led to the construction of those programs, the history of commits, which is also available on GitHub. As Karpathy says:

The ideal training data for an LLM is not what you wrote. It's the full sequence of your internal thoughts and all the individual edits while you wrote it.

If you want to take a look at how the system works, here is the explanatory video by its authors.

2️⃣ Last fortnight I forgot to mention Rabbit's curious gadget, the R1.

It is an interface with a camera, a microphone, and a touch screen. It is a charming little agent with which you can interact in natural language so that an LLM executes your requests on a computer in the cloud. What is interesting is that the LLM has been trained on screenshots and human actions, and is able to navigate applications and web pages and interact with them. In the demo that Rabbit's CEO presented at CES, which has more than 5 million views, you can see the device being used to order a pizza or book a flight.

The device brings to mind science-fiction computational devices such as those in Star Trek or Her. Will this be the new way of interacting with computers? Will it be easier to talk to a computer than to use user interfaces, apps, and your finger? Perhaps for some things yes, but in general I do not think so. I do not think traditional user interfaces are going to disappear. I completely agree with everything said in the episode of the Techmeme Ride Home podcast featuring John Gruber, Chris Messina, and Brian McCullough. Highly recommended.

3️⃣ Sam Altman has been in Davos, and we have heard him in several public events. Of what I have heard, what seemed most interesting to me was this interview in The Economist together with Satya Nadella, available to subscribers. A summary and commentary on the interview can be heard in the episode of The Economist's Babbage podcast.

Some of Altman's lines in the interview:

The model [that powers ChatGPT] is going to get smarter and smarter, more capable [...]. Reasoning is one of the capabilities in which the model will improve [...]. It will improve in general. That is one of the features of these models, that they improve in general capabilities, and that leads to improvements in specific features, such as writing better code.

Suppose GPT-4 is capable of doing 10% of a human being's tasks. How is GPT-5 going to improve? Will it be able to do 12%, 15%, 20%? That is the right way to measure improvement.

We're going to invent AGI sooner than most people think.

Every year we will put into circulation a better model than the model from the year before. If you put an iPhone 1 next to an iPhone 15, you realize the enormous difference between them, how bad the first one was, even though it was a revolution. And no one complained along the way asking for a better iPhone. Something like that is going to happen with models. GPT-2 was horrible, GPT-3 was pretty bad, GPT-4 is bad, GPT-5 will be okay [and so on until AGI].

4️⃣ Two very important regional changes in the Apple App Store: links to external purchases, in the US, and the introduction of alternative app stores, in the EU. These are the first relevant changes in Apple's application platform in almost 15 years, since in-app purchases were introduced in 2009.

The first change is already in force for the US App Store and is a consequence of the final ruling in the Epic trial. The ruling forces Apple to allow apps to include a link taking users to a website where an external purchase can be made, independently of in-app purchases.

Apple has complied with the ruling by updating its APIs and introducing a StoreKit External Purchase Link, but it is keeping a commission of 27% (12% for small developers) on purchases made this way. I found it curious how much this angered some American developers, who felt disappointed and betrayed by Apple's revenue-hungry attitude. I had never heard good old Casey Liss so angry. He let off steam quite thoroughly, together with Marco Arment, in the episode of Accidental Tech Podcast in which they discuss the matter. As always, John Siracusa provided the necessary analytical and rational touch.

I do not see it as such a big deal. These are businesses. I have never minded paying taxes, and I do not complain about them. I wish I paid more: that would mean I earned more. When I make €10,000 with an app, if that ever happens, I will not mind giving €3,000 to Apple. Those are the rules of the game. They provide the platform, the APIs, and the development tools, and I do not think it is wrong for them to take their percentage. Video game companies are far worse, and nobody complains.

The second change is much deeper. On March 7 Apple must comply with the EU's Digital Markets Act (DMA). It has already announced the changes coming to Europe, which it will launch with the upcoming iOS 17.4 update.

There are more than 600 API changes in iOS, on which Apple's developers have been working for more than a year. Jason Snell and Myke Hurley discuss them in great detail in this episode of Upgrade. Javier Lacort also gives a very good summary in this episode of Loop Infinito.

The most important point will be the possibility of using alternative app stores (Apple calls them alternative app marketplaces), in which developers can distribute apps without needing to follow Apple's content guidelines or pay Apple's commission. All distributed apps will still have to be inspected and approved by Apple, in a process called notarization, to verify that they do not contain malware, that they comply with certain privacy standards, and that they do not pose a risk to the iPhone or the operating system.

Apps in these alternative stores will also face a fee, which Apple calls the Core Technology Fee, for the use of the intellectual property of the iOS platform. Developers will have to pay Apple €0.50 for each first annual install above one million. It is still not known whether this will be accepted by the European authorities.

This fee will not be a problem for small developers, who will not reach one million installs and therefore will pay nothing, nor for large developers with a solid business model, since any reasonably profitable business earns much more than €0.50 per user per year. Epic has already announced that it will create its own store. But other companies such as Spotify have already complained to the EU and are doing the math to see whether it will be profitable for them.
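As a rough back-of-the-envelope check of the numbers above, here is a small Python sketch of the fee. It only encodes the two figures mentioned in the text (€0.50 per install, beyond the first million in a year); the function name `annual_fee_eur` is hypothetical, and any other conditions Apple may attach are ignored.

```python
# Figures taken from the text above; everything else is an assumption.
THRESHOLD_INSTALLS = 1_000_000  # installs per year that are free of charge
FEE_PER_INSTALL_EUR = 0.50      # fee for each install above the threshold


def annual_fee_eur(annual_installs: int) -> float:
    """Yearly fee in euros for a given number of annual installs."""
    billable = max(0, annual_installs - THRESHOLD_INSTALLS)
    return billable * FEE_PER_INSTALL_EUR


for installs in (200_000, 1_000_000, 3_000_000):
    fee = annual_fee_eur(installs)
    print(f"{installs:>9,} installs -> EUR {fee:,.2f}")
```

Under these assumptions a small app pays nothing, while an app with three million annual installs owes €1,000,000 a year. That is easy to absorb for a business earning well over €0.50 per user, and a real problem for a free app with a large audience.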

There are many more changes, but we do not have time to describe them here. You can listen to the podcasts I mentioned earlier, or read the article by Jason Snell or the one by John Gruber.

5️⃣ The Vision Pro is finally here! Tomorrow, February 2, it arrives in stores. Yesterday the embargo on the first reviews of the headset was lifted. See, for example, the one by John Gruber, the one by The Verge (in the video below), or the many others gathered in MacStories' roundup.

As we said in the previous newsletter, everyone talks about how spectacular it is to watch films as if you were in a cinema (at last 3D films can be watched properly), about the Disney environments, and about concerts and shows in which it feels as if you are right next to the stage. For my part, I am eager for developers to start building cool interactive things, such as this app by an independent developer that Apple has highlighted, of which for now we only have one picture.

I would also like Apple to explore new forms of interaction. For now, as Siracusa says very well in his article Spatial Computing, Apple uses an indirect interaction model in Vision Pro. The eyes act as the pointer, and the hands, hidden from sight, make the gestures to grab, move, or resize things.

Why can you not point to and grab virtual objects directly on the Vision Pro? Perhaps Apple does not want defects like the one that appears in a moment of Joanna Stern's report for the WSJ, where she is cooking and places a timer over one of the pots. In several frames of that sequence we can see that the headset does not correctly calculate the depth of her hand and the jar she is holding, and the image becomes mixed with the timer.

We will have to wait for more advanced future versions before we can play interactive virtual tabletop games. It even seems that, for now, you cannot do something as apparently simple as share the same cinema app between two people who are in the same room wearing the headset.

For now it is an experience that is too solitary. Let us wait for future versions, more advanced and more affordable ones.

👷‍♂️ My fifteen days

🧑‍💻 My personal website is stalled. Sadly, there is nothing to report about it. Next fortnight I really do have to give it a push.

So as not to leave this section empty, I will mention two tools I use every day. Both are paid, but the price is completely worth it to me because of how much I use them.

Unread, on iPhone: an excellent RSS reader where I keep all the blogs, publications, and so on that I read every day. It is very easy to add an RSS feed. When you are on a page in Safari on the iPhone, you tap "share", select Unread, and the app detects the RSS URL so you can subscribe to it.
Things 3, on Mac: for me the best app for managing pending task lists. I use it on the Mac in a very simple way. I have a single project, which I call "Tasks", where I keep all pending tasks. And when I add a new task I always assign it a date. Either it is "Today", if it has to be done right away, or a specific date in the future, so that it appears on the "Today" screen when that day arrives. And this "Today" screen, where the tasks I have to do today appear, is the one I mainly use to tick things off as I finish them.

The app has many more features: tags, filters, multiple projects, and subprojects. Over the years I have used it I have tried all those things, but in the end I have stayed with the simplest setup. Once my website is up and running, I may write a post explaining all of this in detail.

📺 One more highly recommended series we watched this fortnight: The Other Side, created by Berto Romero and directed by Javier Ruiz Caldera and Alberto de Toro. It is a series in which Berto leaves comedy aside to enter mystery and the supernatural. The performances, the characters, the story, and the atmosphere are all excellent. Those old Nueva Dimensión magazines from the 1980s are wonderful (I have them too), as are those VHS tapes with recordings of the TV program in which Buenafuente is basically a complete Jiménez del Oso.

I had not seen Modelo 77, but I have now corrected that mistake. It also has a spectacular atmosphere. In 1977 I was 13 years old, and I think I remember seeing on the news the inmates of La Modelo prison up on the rooftops. The film shows perfectly the state of the country at that time: labor lawyers, ordinary people, the excitement of the change that was arriving. Opposite them, prison officials and judges from the old regime. And in the middle, the prisoners. Excellent.

📖 The book I am reading is Blindsight, by Peter Watts. I am halfway through and it has everything I like: aliens, spaceships, thought experiments, dystopia. The story hooks you, it has many very interesting elements, and I am enjoying it a great deal. The only thing I find a bit of a slog is the author's cyberpunk style, but you get used to it in the end.

And that is all for this fortnight. See you soon! 👋👋