Why Wikipedia matters, and how to make sense of it (programmatically)

2022-06-23

Some time ago, I wrote a Twitter thread about one of the unseen hard problems in software development: access to common knowledge.

Since then, a few things have happened to me. One of the most important is the start of a new project that tries to attack those problems, named WikipediaQL (which has already attracted some positive attention, even at its early stage). I am still working on that project and plan a series of articles on the problems of common-sense knowledge extraction and practical approaches to it.

As a prelude to this series, and as a linkable justification for various aspects of my work, this article is a (more orderly) republication of the Twitter thread mentioned above.

Some of the hardest problems to raise in the software development ecosystem are those that “intuitively” have an easy answer. (They are hard because, before you can start discussing possible solutions, you need to persuade people that the problem is non-trivial and worth thinking about. And once they understand, they become sad and don’t want to think about it either. I saw this happen when discussing spellcheckers.)

Anyway, about common-sense data and knowledge. How many people live in Albania? What’s the title of Game of Thrones S05E07? What books did Tove Jansson write? When was Google founded?

