Stats update

July 2nd, 2009

Stats time, kids. We’ve had pushes for Mathematica, Ursala, Slate, and MATLAB in the last month. How will that affect the stats? Let’s go to the math!

pageviews0609

Yep, it’s pretty much the same. The Steady Eddie of all stats. Is that a variation there in the middle of June? No, I don’t think so.

viewsday0609

Still looks like a REEEEALLLY slow climb up overall, but still pretty much the same. I think those last three dips are all on weekends. You don’t have to restrict your RC use to when you’re bored at work. What’s more fun on a Saturday night than writing Multiplicative order in C?

pageedits0609

I think this is still dominated by bots. We really need to fix that. Can anyone help?

editsday0609

It looks like this one actually picked up a bit near the end of May and continued it through June. Take about 2400 off for bot over-editing (about 400 edits six times a day) after the end of March and you’ll see that we’re getting more edit activity.
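
That adjustment is just arithmetic. Here is a minimal sketch of it: the bot figures come from the paragraph above, but the raw daily total is a made-up number for illustration, since the actual values live in the chart.

```python
# Bot figures are from the post; the raw daily total used below is hypothetical.
BOT_EDITS_PER_RUN = 400  # "about 400 edits" per bot run
BOT_RUNS_PER_DAY = 6     # "six times a day"


def human_edits_per_day(raw_edits_per_day: int) -> int:
    """Estimate non-bot edits by subtracting the bot's daily over-editing."""
    bot_edits = BOT_EDITS_PER_RUN * BOT_RUNS_PER_DAY  # 400 * 6 = 2400
    # Clamp at zero so a quiet day never shows a negative count.
    return max(raw_edits_per_day - bot_edits, 0)


# A hypothetical day with 2600 recorded edits leaves about 200 human edits.
print(human_edits_per_day(2600))
```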

viewsedit0609

It’s almost leveled off. Just imagine how it’ll change when the bot gets fixed.

Top ten tasks by all time views:

  1. IsNumeric (19,115 views)
  2. Assigning Values to an Array (15,796 views)
  3. Change string case (15,081 views)
  4. Tokenizing A String (14,808 views)
  5. Execute a System Command (14,697 views)
  6. File I/O (12,588 views)
  7. Sorting an Array of Integers (10,915 views)
  8. Bubble Sort (10,252 views)
  9. Creating an Associative Array (10,156 views)
  10. Creating an Array (10,121 views)

The bottom two swapped, but that’s it. Apparently visitors are most concerned with whether a string can be translated into a number or not.

I don’t think tasks by all time edits was really that informative. If you want it back, ask.

Top ten programming languages by number of examples:

  1. Tcl - 315
  2. Python - 277
  3. Ruby - 244
  4. Ada - 236
  5. C - 233
  6. Perl - 231
  7. OCaml - 212
  8. Java - 210
  9. AutoHotkey - 201
  10. Haskell - 199

D and ALGOL 68 got knocked off in favor of Ruby and AutoHotkey. AHK had a huge push once it came to RC. Tcl remained solidly on top.

PS: Thanks to Glennj for the Ruby example that helped me quickly get the top ten languages.

A mashup challenge

June 3rd, 2009

This is a coding challenge of a different sort. Take this XML export of Rosetta Code’s task pages, and create a mashup with the data. Post a link to your mashup in the comments below. If you don’t have convenient hosting for it, leave a note in the comments and someone will give you a hand, if possible.

If production of your mashup is (or can be) automated, please make a note of it. Regular exports of RC data are on the to-do list, and it would be pretty slick if interesting mashups could be generated in sync.

Be aware that the exported data linked to above is licensed under version 1.2 of the GNU Free Documentation License.

Stats update

May 31st, 2009

This one counts for double. Last month I had a lot of coursework to do so I didn’t do a stats update, but I still kept all the data. Let’s go to the charts:

pageviews0509

So views have been growing pretty steadily since the stats posts started. At some point, there will be more of a push for popularity.

viewsday0509

The trendline shows a small increase over time, but I wouldn’t get in a fuss about it.

pageedits0509

Can you guess when the bots started? Right now, the main editing bot, ImplSearchBot, edits way more pages than it needs to. A fix has been planned, but the owner of the bot hasn’t had time. Anyone who would like to help can talk to Short Circuit about it.

editsday0509

This stat is heavily influenced by the bots as well. More human edits should be coming up with the addition of language feature parameters to the language template. It might be nice if someone could fit those features into a spiffy-looking div (talk to Short Circuit about your plans). Once that is all worked out, we may even get a new sidebar link.

viewsedit0509

This one is taking a hard hit with the bots. We were over 60 back at the beginning of the year, but now we are down to single digits. We either need more people using the site, or fewer edits from the bots. Both would be best.

Now on to the top ten lists:

Ten most popular tasks by all-time views:

  1. isNumeric
  2. Assigning Values to an Array
  3. Change string case
  4. Tokenizing A String
  5. Execute a System Command
  6. File I/O
  7. Sorting an Array of Integers
  8. Bubble Sort
  9. Creating an Array
  10. Creating an Associative Array

Bubble Sort and Creating an Array have swapped.

Ten most popular tasks by all-time edits:

  1. Creating an Array
  2. IsNumeric
  3. Apply a callback to an Array
  4. Conditional Structures
  5. Empty Program
  6. String Character Length
  7. User Output - text
  8. User Input
  9. Control Structures
  10. Creating a Window

Sum of Array dropped off the list and Creating a Window entered at number ten. Empty Program, User Output - text, and Conditional Structures moved up the list.

Top ten most prolific languages on RC by number of examples:

  1. Tcl 291
  2. Python 255
  3. Ada 224
  4. C 220
  5. Perl 213
  6. OCaml 203
  7. ALGOL 68 196
  8. Haskell 196
  9. Java 193
  10. D 182

Ruby (at 162) was pushed off the bottom of the list by Tcl coming out of nowhere and implementing all but four tasks on RC. All others stayed in the same order, but with more examples.

A quick challenge

May 12th, 2009

I don’t care if you’re a regular contributor or a lurker, here’s a quick challenge for you. Follow this link; it’ll take you to a random page on the site. Look at the page. Think of how it could be better. Then click on the “Discuss” link at the top of the page, and report your thoughts. Repeat as many times as you’d like.

If you’re so inclined, look at the Recent Changes page to see where other people might have done the same, look at what they’ve said, and respond.

Have at it!

Rosetta Code TODO list

April 24th, 2009

I’ve been averaging 70-90 hours of work per week for a few weeks now, and a lot of work on Rosetta Code has had to be put off.  So here’s a TODO list of accumulated things that need to be done on Rosetta Code, and have either been in the works for a long time or have been planned. (And by “planned”, I mean that some of these things have been ideas that just won’t go away.)

At the top of the list, these things are either urgent or are already in the “pipeline”:

  • Mod Alias needs to be set up on RC, as mod_rewrite’s ‘+’ handling became broken in the switch from mod_php to fcgi, and we’re back in the bad old days of C++ pointing to C.  At least there’s something we can do about it now…
  • ImplSearchBot needs to be fixed.  It’s editing almost 400 pages every four hours, when it only needs to be editing between one and ten, on average.
  • ImplSearchBot’s Subversion repository (where it keeps the JSON caches of category contents) needs to be opened up for general consumption.
  • ImplSearchBot’s Subversion repository needs to be abused to generate RSS feeds containing interesting events per language.
  • There are some bugs in the way Rosetta Code’s syntax highlighting deals with leading whitespace.  Details are in the relevant Village Pump page.  There also appears to be a bug breaking Unicode support with at least some languages when dealing with the string “møøse”.  Not sure why this would be.
  • Need to finish RC promo video.  Looking for suitable audio to sync.
  • Find out what causes Recent Changes RSS feed to spit out batches of duplicate items a couple times a week.
  • Rewrite the Rosetta theme from scratch.
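
For the ImplSearchBot item above, one way to get from 400 edits per run down to a handful is to compare each freshly generated page against a cached digest and submit only the pages that differ. The sketch below shows that idea and nothing more. It is not the bot’s actual code, and `content_digest`, `pages_needing_edit`, and the page titles are hypothetical names made up for illustration.

```python
import hashlib


def content_digest(text: str) -> str:
    """Return a stable fingerprint of a page's generated text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def pages_needing_edit(new_pages: dict, cached_digests: dict) -> list:
    """List the titles whose generated text differs from the cached digest.

    new_pages maps title -> freshly generated text (hypothetical input).
    cached_digests maps title -> digest saved after the previous run.
    """
    changed = []
    for title, text in new_pages.items():
        # A title missing from the cache counts as changed.
        if cached_digests.get(title) != content_digest(text):
            changed.append(title)
    return sorted(changed)


# Hypothetical run: only one of the two pages differs from the last run.
cache = {"Tasks not implemented in C": content_digest("Bubble Sort")}
fresh = {
    "Tasks not implemented in C": "Bubble Sort",
    "Tasks not implemented in D": "File I/O",
}
print(pages_needing_edit(fresh, cache))
```

Only the titles returned would be sent to the wiki, so a run where nothing changed would make no edits at all.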

Things that I want to start on:

I’d like to see a bit of a shift away from theoretical tasks to practical tasks, and move from explicitly contrasting languages to identifying where a language’s abilities can be taken advantage of for things that programmers often need to do.

If anyone wants to give these a try (especially creating more tasks, creating RC promo material, or anything that requires a bot), go ahead, give it a shot!  I haven’t had a whole lot of time of late.

(Yes, I know there are a lot of links; it comes from having a lot of proper nouns and other interesting concepts…)

Downtime resolved

April 20th, 2009

I figured out what killed the server Saturday: it was ImageMagick.  The ALGOL 68 Dragon Curve animated GIF is fairly large.  Someone went to visit the GIF’s page on RC, and MediaWiki ran ‘convert’ to create a thumbnail.

The command MediaWiki ran (approximately; the paths and filenames have been changed to protect the server, and because I tested it locally) was:

convert -background white -size 781 ALGOL_68_Dragon_curve_animated.gif -coalesce -thumbnail '781x599!' -depth 8 out.gif

That command, with that data, takes several seconds to run on my Phenom 9650 desktop at home, and my machine has a darn sight more CPU available to programs than anything running within RC’s VPS slice.  As a result, when MediaWiki ran convert (via mod_php, via Apache), it spent several seconds trying to generate a thumbnail.  It would have eventually finished, except that whoever was looking at the page got bored, and refreshed.  Several times.  Each refresh started another convert process.  When I discovered that the server was having issues, the server load average was up around 16.  RC’s slice usually hovers between 0.00 and 0.15.

Three things have been implemented to fix this problem.  First, I’ve switched from mod_php to FastCGI, with the assistance of some of the folks in #mediawiki on FreeNode.  As a result, we get back an HTTP 500 ISE when commands run by the server take longer than expected to complete. (For whatever reason, mod_php was simply hanging.)  Second, I’ve turned off thumbnails.  None of the images on the site are large enough to make them worthwhile, for the time being.  I’ll look into backgrounding convert processes in a way that doesn’t take down the site, but that’s going to be low-priority for now. The third bit came as part of my debugging.  All external commands run by MediaWiki will now be logged, to leave some sort of trace for when this kind of problem happens again.
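
The general idea behind the first and third fixes (give up on slow commands, and log every command that gets run) can be sketched in a few lines. This is not MediaWiki’s mechanism, which is PHP and server configuration. It is a standalone Python illustration, and `run_logged` and its five-second default are assumptions made up for the example.

```python
import logging
import subprocess
import sys

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("external-commands")


def run_logged(argv, timeout_seconds=5.0):
    """Run argv, log what happened, and give up after timeout_seconds.

    Returns True only if the command finished in time with exit status 0.
    """
    log.info("running: %s", argv)
    try:
        # subprocess.run kills the child process if the timeout expires.
        result = subprocess.run(argv, timeout=timeout_seconds, capture_output=True)
    except subprocess.TimeoutExpired:
        log.warning("timed out after %ss: %s", timeout_seconds, argv)
        return False
    log.info("exit status %d: %s", result.returncode, argv)
    return result.returncode == 0


# Stand-in for a slow thumbnail job: a child process that sleeps too long.
slow_job = [sys.executable, "-c", "import time; time.sleep(5)"]
print(run_logged(slow_job, timeout_seconds=0.5))
```

With a cap like this, an impatient visitor hitting refresh can still start several jobs, but each one is cut off and leaves a log line instead of piling up indefinitely.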

Downtime

April 18th, 2009

Sorry about the downtime. I’m not entirely certain what the cause was, but the fix has been to switch from mod_php to fcgi, and correct a few caching settings in MediaWiki. My best guess is that a high-traffic site linked to (or embedded) the ALGOL 68 Dragon Curve animated GIF thumbnail, which was apparently causing several hung instances of ImageMagick’s convert tool. I’ll know more when I have time to look at the logs and analytics data tonight.

Calling the POD People

April 18th, 2009

Hello, it’s me, Mike Mol.  I’m writing here today because I’d like to do something, and I don’t know how to do it.  While that’s generally the case for the folks who visit Rosetta Code, this particular question can’t be solved by comparing two or more programming languages, or by putting up a Task and seeing how other people do it.

I would like to extend Rosetta Code to print.  As in bound hard-copy dead trees.  I’d like for Rosetta Code to sell one or more books that take a few languages, show those languages side-by-side for various tasks, and, of course, list all the contributors to those code samples, and include a URL where the print-ready PDF is available. (It is GFDL content, after all.)

To go from print-ready PDF to an actual hardcopy book, I need a printer.  To avoid dealing with sales, I need a publisher. (I have absolutely no interest in dealing with the headaches revolving around processing payments from PayPal, checks or any other payment vendor, canceled checks, refunds, returns, you name it.)  So far, POD services like LuLu would seem to be the best option.

Of course, I’d like to avoid as much editorial and layout work as possible, so I’ll most likely automate the entire process, and therein lies the problem: any time a new book is ready to go out, I’ll need to upload it and set everything up.  I would much, much, much prefer to be able to use a service where an API allows me to programmatically do all the work I would otherwise have to do by hand.  I have no complaints about writing the code on my end to interact with such a thing.

If I can get that far, then the skies open up with a realm of possibilities. I could provide a page where anyone could request that a book be published that compares the languages they’d like to see compared, or that includes all of the tasks that they’re interested in.  They click Submit, the script spends a day or two preparing the book, and then the PDF and book get published simultaneously, and they can have their copy (or copies) within a week or two.

There are a number of cases where somebody might want one (or more) hard copies.  This could be particularly valuable to teachers who want to contrast a set of languages, showcase a particular language, provide example code for a number of algorithms or other problems, etc.

Yes, there’s obviously concern about ill-timed vandalism.  Each PDF would probably be held for a little while before being sent on to the publisher.  That would mostly be a case of watching the pages that were included for signs that they’d been vandalized. (Rosetta Code has an awesome community for watching for vandalism.)  I suppose somebody could get themselves listed as a contributor under an obscene name, but I can work around that, too.

Stats update

March 31st, 2009

The good news: we are getting more views per day. The bad news: bots do crazy things to edit counts. Trendlines in black as usual.

pageviews0409

As you can see, we have a nice steady increase in page views. We should hit 2M sometime soon.

viewsday0409

The trendline shows a small increase in views per day. I have also noticed a few new users recently. The site may be getting more popular.

pageedits0409

ImplSearchBot really took over at the end of the month. I thought previously that edits that didn’t change anything weren’t counted, but that’s not how it works with API calls. Edit stats will be skewed until this can be fixed.

editsday0409

viewsedit0409

Ten most popular tasks by all-time views:

  1. isNumeric
  2. Assigning Values to an Array
  3. Change string case
  4. Tokenizing A String
  5. Execute a System Command
  6. File I/O
  7. Sorting an Array of Integers
  8. Creating an Array
  9. Bubble Sort
  10. Creating an Associative Array

Empty Program has moved off the bottom of the list and Sorting an Array of Integers has shot into 7th.

Ten most popular tasks by all-time edits:

  1. Creating an Array
  2. Apply a callback to an Array
  3. IsNumeric
  4. String Character Length
  5. Conditional Structures
  6. Control Structures
  7. User Input
  8. Empty Program
  9. Sum of Array
  10. User Output - text

Top ten most prolific languages on RC by number of examples:

  1. Python 230
  2. Ada 216
  3. OCaml 200
  4. C 196
  5. Perl 192
  6. Haskell 189
  7. Java 179
  8. D 174
  9. ALGOL 68 167
  10. Ruby 156

D moved quickly into 8th and pushed Forth off the list.

Stats update

March 2nd, 2009

It’s that time of the month. Time to pay rent and time for stats. This month new bots have pushed up edit counts and posts on programming reddit may have pushed up view counts.

(Apologies for weird, different-looking graphs this time around. I had to use another computer to make them.)

pageviews0309

Trend lines have been added for these graphs in black just for more information. Regular bots (starting mid-February) have affected edits stats. The addition of the proggit template may have affected viewership, though.

viewsday0309

pageedits0309

editsday0309

viewsedit0309

Ten most popular tasks by all-time views:

  1. isNumeric
  2. Assigning Values to an Array
  3. Change string case
  4. Tokenizing A String
  5. Execute a System Command
  6. File I/O
  7. Creating an Array
  8. Bubble Sort
  9. Creating an Associative Array
  10. Empty Program

Ten most popular tasks by all-time edits:

  1. Creating an Array
  2. Apply a callback to an Array
  3. IsNumeric
  4. String Character Length
  5. Control Structures
  6. Conditional Structures
  7. Sum of Array
  8. Empty Program
  9. User Output - text
  10. User Input

Top ten most prolific languages on RC by number of examples:

  1. Python 222
  2. Ada 209
  3. OCaml 194
  4. Perl 187
  5. Haskell 183
  6. C 182
  7. Java 173
  8. Ruby 155
  9. ALGOL 68 148
  10. Forth 141

Keep those programming examples coming. The world must learn.