Friday, March 23, 2007
Is Wikipedia approaching a barrier?
As an example, consider all possible Wikipedia articles about Ethiopia. The first few possible subjects are easy to name: an article about the nation, its history, economy, geography, etc. -- all of the topics that the average encyclopedia article would have. The next step is a little harder, but still straightforward & easily done -- series of articles on related things, like its rulers or heads of state, the historic provinces or the current subdivisions, or major battles. (Warfare seems to be a perennial favorite topic, second only to pop culture.) Another avenue is mining online sources for further topics: translation from one electronic format to another is always faster than translation from print to electronic. Yet eventually all of the low-hanging fruit gets picked, and a would-be contributor finds it easier to improve existing articles than to create new ones.
There is also the dynamic that some articles -- either stubs or articles of marginal importance -- are merged into a single article. Consider the Patriarchs of Alexandria: of the first 12 office holders, all but Mark the Evangelist and Demetrius are little more than names, even to the most informed specialist. One of my long-procrastinated projects is to combine the entries of ten of these ancient religious leaders into a single article containing what little information we possess about them; when this is done, 10 articles will become one -- effectively decreasing the total number of articles.
That is why I found Sage Ross' analysis important: where I have been guessing, he did the necessary number-crunching to show that we are already approaching this limit. He writes:
Another side to the watershed, which nobody is quite recognizing yet, relates to the limits of Wikipedia. The exponential phase of (English) Wikipedia's growth (in terms of number of articles, and in terms of number of active users) is probably over. From 2003 to mid-2006, the number of articles had followed a very regular exponential pattern. Had exponential growth continued, it would have hit 2,000,000 a few weeks ago; it just passed 1,700,000 today. The average number of articles created per day since late December (around 1724) has actually been lower than the average number per day over the previous year (1823). This difference is only partly the result of the always slower holiday season.
Sage's conclusion is identical to mine: "It seems that available unwritten encyclopedic topics is becoming a significant constraint."
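To put the quoted figures in proportion, here is a minimal sketch of the arithmetic. It uses only the four numbers Sage cites; the variable names and the helper function are my own, and the shortfall is only approximate because the 2,000,000 projection and the 1,700,000 count refer to dates a few weeks apart.

```python
# Figures quoted from Sage Ross's analysis; all names here are illustrative.
projected_articles = 2_000_000  # where continued exponential growth would have led
actual_articles = 1_700_000     # the count English Wikipedia had just passed
recent_daily_rate = 1724        # average new articles per day since late December
prior_daily_rate = 1823         # average new articles per day over the previous year


def percent_change(old, new):
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100


# Gap between the exponential projection and the actual article count.
shortfall = projected_articles - actual_articles
shortfall_pct = shortfall / projected_articles * 100

# Relative change in the daily article-creation rate.
rate_change_pct = percent_change(prior_daily_rate, recent_daily_rate)

print(f"Shortfall against projection: {shortfall:,} articles ({shortfall_pct:.1f}%)")
print(f"Change in daily creation rate: {rate_change_pct:.1f}%")
```

In other words, the article count is running about 15% below the exponential projection, while the daily creation rate has dipped by a little over 5%: a slowdown, not a halt.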
If we are correct, the principle of least work -- the easiest tasks will almost always be completed first -- would predict that the quality of Wikipedia's articles will gradually start to improve, because improving them is becoming the easiest task on Wikipedia. This may at first take the form of automated edits: bots making large numbers of repetitive changes. Eventually, however, someone will have to acknowledge the countless requests for sources that dot so many Wikipedia articles, and begin the long, tedious task of researching each issue and meeting that demand. It will be interesting to observe Wikipedia's reputation in schools and the mass media once that effort has made notable progress.
Whenever I see someone say or think that perhaps Wikipedia is somewhere near completion, I go and re-read http://en.wikipedia.org/wiki/User:Piotrus/Wikipedia_interwiki_and_specialized_knowledge_test
As for the "AfD/CSD 'notability' barrier", that may be another factor. As Wikipedia's coverage reaches further into esoteric subjects, it becomes more important for a Wikipedian to explain at the beginning of the article why the subject is important. We can more or less intuitively accept why a head of state is notable; why a given programmer or porn star is notable is not always obvious.
However, translation is one of those barriers that keep sources like the Polish or German Wikipedia from being considered "low-lying fruit". Another is the problem of writing general articles: it takes an informed mind to write a good article that adequately covers a general subject. This is the reason people most often give when someone points out that our best articles are specialized, esoteric ones -- not the general ones.