Thursday, November 30, 2006

 

Nothing on the Internet is ever new

Even threaded conversations, the stand-out feature of email and Usenet, existed before them. Or so this account of the 150-year-old periodical Notes and Queries states.

Worth keeping in mind when people praise Wikipedia as original and innovative.

Wednesday, November 29, 2006

 

More on the number of Wikipedians

Thanks to Erik Zachte (with whom I shared a crowded booth in a pizza shop at Wikimania last summer), we once again have the Wikipedia statistics needed to answer the question I posed in my previous post, "Can Wikipedia keep growing?" So just how many dedicated Wikipedians are there?

According to Erik's work, my guesstimate appears to be too low. In the latest reported period (June, 2006) there were 4250 people who made at least 100 edits a month. Then again, this works out to an average of a little more than 3 edits a day, and that is still not enough to reliably gain attention from other dedicated Wikipedians. My impression, which I've seen often repeated in the discussions over nominations for Administrator, is that 200 edits a month should be the minimum. So this number of dedicated Wikipedians -- the core of the community if you will -- is even smaller.
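The per-day figures behind that estimate can be checked with a few lines of Python. This is only a back-of-envelope sketch: it assumes a 30-day month (June has 30 days), and the function name is made up for illustration.

```python
# Back-of-envelope check of the edit rates quoted above.
# Assumption: a 30-day month (the reported period, June 2006, has 30 days).
DAYS_IN_MONTH = 30

def edits_per_day(edits_per_month, days=DAYS_IN_MONTH):
    """Average daily edits implied by a monthly edit count."""
    return edits_per_month / days

# 100 is the threshold in the statistics; 200 is the suggested minimum.
for threshold in (100, 200):
    rate = edits_per_day(threshold)
    print(f"{threshold} edits/month is about {rate:.1f} edits/day")
```

Under that assumption, the 200-edit threshold simply doubles the daily rate, so fewer editors would clear it.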

And remember, even if we accept the 4250 figure, this does not mean these same people are at work every day and every month: there is a lot of turnover amongst Wikipedians. People come, people go, and often serious contributors find that they are kept away from working on the project for other reasons. One reason I've seen mentioned frequently is the pressure of schoolwork, either high school or college. But family, work, and travel are other common reasons, and some days an editor just suffers from writer's block or the need for a short, unofficial break; we're all volunteers, after all, and no one has the right to complain if we stop working. (Which makes the complaints about our work all the more baffling to me.)

It appears that the best I can do is say that the size of this group is "in the thousands". Any numbers we come up with, if honestly used, will only be a starting point for more or less informed guesses.

Geoff



Friday, November 24, 2006

 

Something unrelated to computers that was user unfriendly

Saw the Ancient Egyptian exhibit at the Portland Art Museum today. The pieces were undeniably wonderful, but afterwards, my chief impression was how the exhibit was defeated by its layout. And the problems, like those of any interface, could have been addressed with a few simple fixes.

First was the problem of how to get on from the initial exhibit -- the monumental head of Senusret I that had been recarved to portray Ramesses II. My wife and I examined it, wandered behind it to look at the lid of an Egyptian coffin -- then puzzled over where to go to see the rest of the exhibit. After some fumbling around, we found that we were expected to know that only these two items were on the ground floor & that we had to go upstairs to view the rest of the artifacts.

(Actually, this was my wife's complaint. Trust me, she's no dummy, having passed the CPA exam on the first try. More importantly, once she became annoyed with this exhibit, I couldn't help but become annoyed with her.)

Once up there, it was difficult to view anything in the first room due to the crush of the crowd, formed from the visitors who were -- as could be expected -- eager to see the treasures. And then there was a fragment of an obelisk I was looking forward to seeing -- but it had been placed on the far side from the door of a small room where the usual "Egyptian funerary beliefs for beginners" video was playing. To examine the obelisk meant I had to stand in the middle of the room & block the view of several people, & I wasn't interested in watching a video that repeated a lot of things that I already knew.

Probably what stands out most in my mind was a room created in imitation of a similar room in the tomb of Thutmose III. Someone had put a lot of thought & work into the effort, & if one listened to the audio guide for this part of the exhibit, it was quite informative. However, when I first entered the room, I had no idea that the audio guide was so crucial: I walked into the darkened room, avoided walking into a few more absorbed groups of visitors, glanced at the 3 or 4 objects in plexiglass cases, noticed that the floor of the room was covered with worn 2x10 planking, & wondered what the point was of having a reproduction of the Egyptian Book of the Dead papered over the walls. At this point I walked back to the entrance, set my listening device to this room, & found out what I had been missing -- all of which could have been made much easier for people like me had there been a sign above the entrance saying something like, "A reproduction of a chamber from the Tomb of Thutmose III".

Usually I ignore these devices, because they offer nothing more than a re-hashing of what I read 20 years ago, and fail to answer the questions a nerd like me often has. For example, at one point the voice of Jeremy Irons intoned that if the devoted family forgot to visit the tomb and make an offering to the dead, the spirit could be nourished merely by looking at one of the pictures. My question about this statement is simple: do the ancient Egyptians actually tell us this is what they believed, or is it something one of the scholars suggested, & which has since been accepted as fact by generations of Egyptologists and popularizers?

However, let me emphasise that I did enjoy seeing these actual ancient objects up close, despite what the above might suggest. Seeing the actual object versus looking at even a high-quality photograph allows one a true intimacy. Being intimate means more than seeing something in the equivalent of its Sunday Best and perfect manners: it also means seeing what is worn when there hasn't been enough time to do laundry. There was a model boat from the tomb of Amenhotep II, and I spent my time studying the parts that do not intentionally appear in the pictures, like the backs of the objects on the deck (where the catalog numbers and the grain of the wood could be seen) or the bow and stern of the boat, and marvelling at the wear on the piece.

Geoff

Wednesday, November 22, 2006

 

The Quality Push, part II

The Wikipedia Signpost published the results of another writing contest. The Signpost writer noted that the three top articles were all affiliated with the Military history WikiProject, and praised the efforts of Wikipedian Kirill Lokshin.

Some people are too useful to spend their time on the Arbitration Committee, which has the thankless chore of being the final referee of disputes and uncivil behavior.

Geoff

Saturday, November 18, 2006

 

Writing articles, part II

I've been writing articles on the woredas -- that's Amharic for "district" -- of Ethiopia. There are roughly 520 of them worth an article (the larger cities like Addis Ababa & Dire Dawa are split into woredas, but in their case these woredas are like administrative wards), & I've put up around 400 of them. But now I'm stalled.

Part of the problem is that for the remaining number, I have to figure out how to integrate a new source of information: before its people forgot to pay their domain name fees, the Oromia Region had its own website, with an extensive amount of useful data -- including economic breakdowns of about half of its woredas. (I'm not sure why only half of the possible material was posted, but it appears as if someone made a mistake, & no one else bothered to check the work. And there was no contact information for the website -- so I had to simply be happy with what I got.) Once I figure out how to chop & sort this abundance of information, I'll continue with this series of articles.

But a large part was that I was growing bored with this task. I haven't stopped for good, but contributing a large number of stubs, many of which will never grow into complete articles without even more work from me, can be discouraging. Add to that these distractions:



Since the first few days I started contributing to Wikipedia, the choice has always been whether to greatly improve a small number of articles or to make smaller improvements to a larger number of articles. I just need to remember that the average quality of Wikipedia improves as long as I do one or the other -- and not write posts in the Talk or Wikipedia namespaces about what should be done to improve Wikipedia.

Geoff

Thursday, November 16, 2006

 

Another interesting link

In "Researching with Wikipedia", Eiffel links to an article started by Jmabel, Wikipedia:Researching with Wikipedia, which grew out of a discussion at the First Seattle WikiMeetup.

Of related interest at that website is an earlier post, also by Eiffel, "Is Wikipedia a legitimate research source?"

Geoff

 

Refining original research

If you browse the theoretical pages of Wikipedia, you will find a number of threads on the issue of "original research" -- contributing to Wikipedia novel conclusions, novel interpretations of existing evidence, or new "discoveries". While the intent of this restriction was, at first, merely to keep the work of cranks who clutter the archives of the many Usenet sci.* groups off of Wikipedia, it has since been extended to forbid adding many different conclusions -- even those that are undeniably obvious. One recent example was an editor who stubbornly insisted that, in V for Vendetta (film), the section "The letter V and the number 5" is entirely "original research".

If one wants to play the part of a fool, as Thersites did in the second book of The Iliad, then others should be allowed to respond as Homer reports Odysseus did.

What has been overlooked in this expansion of the rule against Original Research is that many kooks and cranks want to publish their "new and important findings" in order to stop further debate. They are not attempting to present one more opinion, one more point of view -- which under the guidelines of Neutral Point of View is permitted -- but to exclude all other points of view and replace them with only one -- their own. I doubt anyone would believe this is healthy.

I find the following, taken from the American Library Association's ideal professional standards for publishing original research, useful:

3.5.2 Scholarship:
  • Academic careers exist to make distinctions and to open up spaces of difference in order to produce new knowledge through experiment, speculation, and interpretation or through study and commentary on or revision of work done in the past.

  • Scholarship is highly cumulative and iterative: it tends to resist closure. Academic books thus participate in and stir critique and controversy even as they pretend to settle the matter once and for all.

  • Although the range of subjects and variety of tones used in academic writing has expanded considerably of late, academic books tend to be critical rather than promotional and provide alternative views and counter arguments. They test and probe even as they assert and celebrate.


I doubt that Wikipedia will reach the point where we can enforce these standards; it's hard enough making sure contributions follow the rules of grammatical English. Yet I believe this is something to keep in mind in our ongoing quest to improve the quality of Wikipedia.

Geoff

Wednesday, November 15, 2006

 

A philosophical meditation

I wrote the following paragraph as part of an application I turned in yesterday. I'm sharing it here because I felt it should be read before vanishing into a file somewhere, never to be seen by more than a few sets of eyes.

One hobby I have continued from my earlier years is writing, but now I do it for Wikipedia, a free encyclopedia available on the Internet. While I admit that I am writing for myself, I am also consciously writing for a specific kind of child out there who I imagine is a precocious reader of books, as I was, but who also shares the frustration I had at that age over a lack of access to the important, grown-up books -- or who perhaps, for various reasons, does not have access even to many of the books I had at that age. Schools, well-meaning adults, and even parents sometimes fail to reach and help precocious children, so resources like public libraries and Wikipedia must be present and strong enough to help them help themselves.


Geoff

Monday, November 13, 2006

 

Can Wikipedia keep growing?

As I write this, the Main Page of Wikipedia states that it has 1,482,227 articles. (It's quite safe to assume that the actual number was at that moment a little higher -- and is even higher the moment you read this.) Over a million articles! Is it possible for this number to climb even higher -- say to two million and above?

If you poke around a little, you will find pages like Wikipedia: Size Comparisons, where it is shown that Wikipedia has 40% more articles than the largest print encyclopedia, the Spanish Enciclopedia universal ilustrada europeo-americana. Then again, if you poke around a little, you may wonder just how many articles are not about popular culture subjects like Lord of the Rings, Anime, video games, and The Simpsons television show.

And yet, the number of articles has been doubling on a fairly regular basis: according to Wikipedia: Modelling Wikipedia's growth, based on prior growth the size doubles once every 354 days. And if you think about it for a moment, there are still thousands -- if not tens of thousands -- of towns and villages outside the United States which lack articles, and each of these is on average connected to at least one person or event that is arguably notable enough for inclusion. That is only one possible case, and I am sure any reader can come up with her or his own. Looked at this way, it would appear that Wikipedia has barely scratched the surface of all of the possible -- and justifiable -- topics.
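To get a feel for what a 354-day doubling time implies, here is a rough sketch in Python. It assumes growth stays purely exponential from the Main Page count quoted above, which is exactly the assumption this post goes on to question; the function name and the target figures are illustrative only.

```python
import math

DOUBLING_DAYS = 354           # doubling time quoted from the growth-modelling page
CURRENT_ARTICLES = 1_482_227  # Main Page count quoted in this post

def days_to_reach(target, current=CURRENT_ARTICLES, doubling_days=DOUBLING_DAYS):
    """Days until `target` articles, assuming steady exponential growth."""
    # current * 2 ** (t / doubling_days) = target, solved for t
    return doubling_days * math.log2(target / current)

# Illustrative targets: two million, and the five-to-ten-million range.
for target in (2_000_000, 5_000_000, 10_000_000):
    days = days_to_reach(target)
    print(f"{target:>10,} articles: about {days:.0f} days ({days / 365:.1f} years)")
```

If the doubling time held, two million articles would be only a matter of months away; whether it can hold is the real question.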

However, I suspect that we are within sight of the practical limit of articles for Wikipedia. To put it in quantifiable terms, I expect that at some point between five and ten million articles, the number of new articles will dramatically fall off. There are a number of reasons for this:

The number of Wikipedians is finite. Something that occasionally gets overlooked is the simple fact that contributing to Wikipedia is a rather odd hobby. Further, writing for Wikipedia is not a way to gain fame or fortune because of its emphasis on group ownership of articles -- which leads to almost all articles having an anonymous personality. While this quality has a number of justifiably positive benefits, it also leads to inevitable frustration from editors, who just as arguably feel that they do not receive enough recognition for their hard work -- which is a topic I will discuss in a later post.

And it takes people to write, revise, and revert the inevitable vandalism on articles. A person can only do so much, regardless of how addicted she or he might be to Wikipedia. Once the number of articles Wikipedia has reaches the practical limit our active members can handle, I wouldn't be surprised if a consensus emerges to stop accepting any new articles. That's not something I'd like to see, but if it needs to happen so we can keep the project going forward, it will happen.

So just how many potential Wikipedians are out there? If we could find a way to determine just how many people use Wikipedia, then compare that number to the number of accounts on Wikipedia (2,740,857 total), then compare that last number to the number of active editors (this information appears to be no longer available, but my seat-of-pants guess is that it is roughly equal to the number of Admins -- 1000), we might be able to extrapolate an answer. But if you consider that 2740 people have created an account on Wikipedia for every currently active editor, it's hard not to argue that the pool of potential editors is very small.
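The ratio above is simple division, sketched here in Python. The 1000 figure is only the seat-of-pants guess from this post, and the 4250 figure is the count of editors with at least 100 edits a month reported in the later follow-up post; the function name is made up for illustration.

```python
ACCOUNTS = 2_740_857  # total Wikipedia accounts quoted in this post

def accounts_per_active_editor(active_editors, accounts=ACCOUNTS):
    """Whole number of registered accounts for each active editor."""
    return accounts // active_editors

# 1000: the guess made in this post (roughly the number of Admins).
# 4250: editors with 100+ edits a month, per the later follow-up post.
for label, active in (("guess", 1_000), ("100+ edits/month", 4_250)):
    ratio = accounts_per_active_editor(active)
    share = 100 * active / ACCOUNTS
    print(f"{label}: {ratio} accounts per active editor ({share:.3f}% of accounts)")
```

Either way, active editors come out as a small fraction of one percent of all accounts.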

Difficulty of writing a new article. How many people out there remember writing papers in college? Remember how difficult it was to write a five-page paper -- let alone a 15-page term paper? Writing articles from scratch for Wikipedia is no easier. I remember organizing the Wikipedia article on King Arthur, and being thanked for sorting it out. And almost four years later, that structure is still almost entirely in place. I could point to a few other examples, but one challenge in writing an article is figuring out how to present the information.

Then there is the challenge of research. Repeating what one might have seen on the television last night will only get an editor so far, and a Google search will not find that much information. The time comes when a serious article-writer has to visit the local public library (or a university library) to learn the information the article needs. And even then, a Wikipedian will encounter another barrier: sometimes the information is available only in a work the library does not have (e.g., anyone know where I can find a copy of the 1994 Ethiopian census?), and sometimes the information may not be available in print & requires original research.

Limits on what to include. Love it, hate it, or just wish we could find another way to figure out which articles we should include or exclude, some subjects will never have an article in Wikipedia. For example, there's a mailbox a few blocks from my house: although I can prove this object exists, I'll be surprised if it is ever the subject of its own article. Wikipedia is a reference work: if no one can be expected to want or need to read about a subject, it shouldn't have an article.

And there are only so many subjects worth an article on Wikipedia. For example, in Oregon, there are just over 90 state legislators: of these, at most a tenth at any time are worth an article on Wikipedia, but I suspect the number might be closer to one or two. Most of them, principled or hard-working as they might be, are doing little more than what their job requires -- and are not of interest to anyone who is not a constituent. And much the same can be said for people in many other fields.

And as I said above, sometimes one cannot find reliable information about a given subject. To put this another way, sometimes when I am researching a subject I reach a point where I am at the limit of what is currently known; I feel as if I am standing at the shoulder of a given expert, watching her or him considering the evidence and in the middle of formulating a conclusion. This is not the cutting edge of knowledge: it is what is sometimes called the bleeding edge, where you gather the finest minds, explain the problem, and they tell you, "Well, this might work" -- but don't offer any guarantee because they honestly don't know.

For example, I've been writing a lot of articles on the local districts of Ethiopia -- called woreda -- and I have often encountered situations where the information is incomplete or contradictory. As a result, I wrote articles about woredas that I suspect do not exist; after working with the material for a while, one comes to a point where one can read between the lines and see what the actual facts are. Yet because of the rules of Wikipedia, if there are reasonably reliable sources then I have to write the article -- even if the cites are wrong. I justify this approach by remembering that people also look to reference works to learn about misinformation: when an article shows where a given fact or statement comes from, they can determine for themselves that it is misleading.

This acknowledgement of limits to our current knowledge, I believe, is one of the justifications to replace the rule against Original Research with one about Verifiability. An intelligent reader will accept that some points of knowledge are still undetermined; and knowing which ones are will help this reader evaluate other claims. Wikipedia therefore is meeting its mission in being a reference work.

This is weird: in a way, this argument defends the existence of stubs -- especially if they fail to mature into complete articles.

Geoff



Saturday, November 11, 2006

 

An Interesting Entry

Kami Huyse admits that she's following Jimbo Wales' instructions by making a post at Talk: Mobile home. After a couple of days of silence, I made a post over at the Village Pump. I'll be watching to see what happens.

I know there are many Wikipedians who would be eager to be paid for working on Wikipedia; I've had a few admit this. Keep in mind that they aren't saying they will post anything to Wikipedia for enough money: they want to be paid for writing the kind of material that they are currently writing for free. If a business respects that we have ethics and a concept of best practices, that company will find a motivated -- and skilled -- pool of labor eager to work for them.


Geoff

Thursday, November 09, 2006

 

From a WikiEN-l thread

An example of an all-too-typical dispute over an article:

In the middle of a discussion about the need/difficulty of finding sources for popular culture subjects, Jimbo Wales complains about an article about Jeli Mateo. "The text is very much non-NPOV, and I have not yet done a check, but if I had to guess, it is a straight ripoff from another website," Jimbo rants. "This is a classic example of fancruft of the worst sort. There is virtually no chance that this article will ever improve".

After a number of readers respond to the charges of "copyright violation", and the use of the always contentious word "fancruft" (which means, in short, "crap only an extreme fan knows or cares about"), Phil Sandifer wonders if this will "cause problems with systemic bias, whereby American, Canadian, and British popular culture will all be far easier to write about than other countries due to the prevalence of English-language fandoms that generate sources?" Wales responds with the observation, "Will a high quality encyclopedia always be biased towards things that have high quality sources? Yes, I hope so. :)"

There is genuine concern about the possibility that Wikipedia is, far too often, written by nerds for other nerds -- all of whom live in North America or Europe; this occasionally leads to awkward misunderstandings like this one about what men think women want. Commenting about "high quality sources", I can attest that, based on my work with articles about Ethiopia, there are high-quality sources in many parts of the world that the average Wikipedian reader might not think have them. However, the difficulty of accessing these sources increases as a function of their distance from the nearest Internet access point.

"The vast majority of these topics I think would be better suited for a merge," observes Mindspillage. Then she adds that "I said something similar a year ago too, on this very list (reproduced at http://en.wikipedia.org/wiki/User:Mindspillage/mergism for what it's worth), and I've mostly given up this issue..."

I wonder what the conversation would look like if Wales had picked Calling shotgun instead as his example.

BTW, the Jeli Mateo article has since been rewritten.

 

Who says Democrats don't have a plan?

It's not my intention to discuss politics in this blog (that's why I have an account at Daily Kos, where you can read my occasional thoughts on that subject), but I feel this thoughtful essay by Stirling Newberry over at TPM Cafe is worth a link.

Tuesday, November 07, 2006

 

Interesting read

My first click today on Google News Blog Search. (Yes, I was looking to see if my blog had been indexed there.)

 

Germany resurgent -- but it's a good thing

Germans have always had a reputation for technological innovation. My grandfather's brother -- the first person AFAIK in the family to have attended, let alone completed college -- attended the University of Stuttgart, received a degree in engineering (as well as a scar to show he belonged to one of the University's fencing clubs), & went on to help design & build the Panama Canal. (My grandfather also worked on the Panama Canal -- but as a foreman, overseeing a gang of African-Americans who did all of the work.)

Then there was that famous political mistake -- considerably more serious than the recent political mistake of the U.S. -- and Germany fell behind the US in that sector.

I'm glad to see that the German version of Wikipedia is contributing to a rebirth of that tradition of technological innovation. Elian's report details some of their recent activities.

Monday, November 06, 2006

 

The title of my blog

So why "Original research"?

I like to think that the idea of banning "original research" from Wikipedia was something I had a hand in formulating. At least it would appear that I did from this email. If this is not my most important contribution to Wikipedia, then I would consider it was the addition of a long list of awkwardly-written articles about every Emperor of Ethiopia. (If some don't read as if they were written with a shovel, that's because someone else came along behind me & fixed them.)

However, while researching the links for this entry, I discovered that the idea of banning "original research" predates that September, 2003 email. One example can be seen here.

So much for my claim to fame.

Sunday, November 05, 2006

 

Hawass lecture

This afternoon I attended Dr. Zahi Hawass' lecture at the Portland Art Museum, "New Discoveries at the Giza Pyramids and the Valley of the Kings". Hawass was in town to promote the opening of the exhibit of Egyptian art at the Museum.

Anecdotes from Hawass' lecture:


I forgot one item:

Saturday, November 04, 2006

 

Writing articles

Last night I found my notes of Jimbo Wales' keynote speech at this summer's Wikimania. I had forgotten his declared offensive to reduce the number of new articles (he used the phrase "the long tail") and increase the quality of the existing articles. Then again, I've been busy working on writing articles on each one of the woredas -- local districts -- of Ethiopia, & maybe my forgetfulness was intentional.

The challenge that the "long tail" poses -- the phrase is a buzz word for all of the items or products that aren't the biggest sellers or most visible topics -- is that one person's trivia is another person's critical information. Not one Ethiopian woreda may ever be important enough to be included as one of Wikipedia's core articles -- but then, I doubt any single US county will ever be that important either, & Wikipedia currently has an article on each & every one of them.

And honestly, I am working on these articles because if I don't do it, it's very likely that no one else will write them, & even more likely that the few that are written won't be well written. And I've already seen evidence that these short, admittedly incomplete accounts of obscure administrative units are attracting attention from other, more informed people. One person wrote an article about the woreda he grew up in, making an obvious attempt to follow the pattern I had set forth, which has been his entire contribution to the project. This is a phenomenon that Aaron Swartz described: many contributions to Wikipedia are made to a small number of articles by someone only interested in a narrow subject, & whose interest in Wikipedia is complete once their contribution is complete.

The cliche "herding cats" comes to mind. After all, people contribute the material to Wikipedia that they are interested in. Yet anyone who bothers to observe just how cats actually behave knows that they can be trained: this is how the mother cat teaches her kittens how to hunt and kill prey. And I have heard stories of people (who obviously have a great deal of patience) teaching their cats to play fetch -- or to use the toilet. A one-time contributor to Wikipedia won't read the extensive policy before making an edit -- but she or he will take the time to study how articles are written & follow examples.

Geoff

Friday, November 03, 2006

 

A real entry

The local paper, the Oregonian, mentioned Wikipedia in an editorial today. (I won't add the URL; I doubt the editorial is online, & they have this annoying habit of removing items more than 30 days old.) It was a surprise to see them mention something I've had a hand in for all of these years; I don't care how old one gets, it's always a shock when something one considers "nothing all that special" is actually treated as something special.

 

E nihil ad blogo

Filler text created to see what my settings look like.
