Thursday, December 20

Final Results of Informal Programming Language Comparison.

A recent post on programming.reddit.com about profanity in comments was highly amusing, if you are childish enough to be amused by such things (I am, anyway). It set me to thinking about Google Code Search and whether it could be used in any profitable way to compare programming languages. I quickly hit upon the hypothesis that programmers content with their language would be more likely to type comments along the lines of "this rocks", while more disaffected ones might type comments along the lines of "this sucks".


Hence, all that was required was to write a bit of code to tabulate the search results by language, and a clear winner would emerge. It did, and it wasn't the one I was expecting.


My final choice of metric was the ratio between the incidence of the word "sucks" and that of the word "rocks", each scaled by the frequency of a neutral control word such as "okay" (and even so, poor old FORTH did not get a look-in, as it got zero hits on all of these). I call this the suckage to rockage ratio and henceforth regard it as the gold standard of programming metrics. Without further ado, here are the charts and the code.
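To make the arithmetic concrete, here is a minimal sketch of the metric on its own, separate from the search code. The function name language-indices and the hit counts are made up for illustration. Returning NIL wherever a zero count would force a division by zero is an assumption about how a FORTH-style blank might be handled, not something the program further down does.

```lisp
;; Hypothetical helper, not part of the original program: computes the
;; indices from raw hit counts and returns NIL for any value that would
;; otherwise require dividing by zero.
(defun language-indices (control-hits sucks-hits rocks-hits)
  "Return (values suckage rockage ratio); each is NIL when undefined."
  (let* ((suckage (unless (zerop control-hits) (/ sucks-hits control-hits)))
         (rockage (unless (zerop control-hits) (/ rocks-hits control-hits)))
         ;; The ratio is rockage divided by suckage, as in the charts.
         (ratio (when (and suckage rockage (not (zerop suckage)))
                  (/ rockage suckage))))
    (values suckage rockage ratio)))

;; Made-up counts, for illustration only:
(language-indices 200 10 30)   ; => 1/20, 3/20, 3
(language-indices 0 0 0)       ; => NIL, NIL, NIL
```

Because both indices are divided by the same control count, the control cancels out of the final ratio. It only matters for the individual suckage and rockage bars.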




Controversy is roughly the combined magnitude of suckage and rockage (the square root of the sum of their squares) - high if the language arouses strong feelings, low if it's boring, staple stuff.
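In the code further down, controversy is computed as the square root of the sum of the squares of the two indices. A small sketch of just that step, with a hypothetical function name and made-up index values:

```lisp
;; Hypothetical standalone version of the controversy calculation.
(defun controversy (suckage rockage)
  "Euclidean norm of the (suckage, rockage) pair."
  (sqrt (+ (* suckage suckage) (* rockage rockage))))

;; Made-up indices: a 3-4-5 triangle scaled down, so roughly 0.05.
(controversy 3/100 4/100)
```

A language scores high when either index is large, so one that is both loved and loathed lands above one that is merely loved.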




C!? I really didn't expect this. It might be due to the inability of the search to separate out C and C++. It's almost certainly due to a low suckage rather than a high rockage.



Here is the code, which is Public Domain, if anyone is inclined to tinker.



(asdf:oos 'asdf:load-op 'drakma)
(asdf:oos 'asdf:load-op 's-xml)
(asdf:oos 'asdf:load-op 'clot)

(defpackage :code-index (:use :cl :clot :drakma :s-xml))

(in-package :code-index)

(defun code-search (regexp &key language license file package (output-type :sxml))
  ;; OUTPUT-TYPE is accepted but not passed on to the parser yet.
  (declare (ignorable output-type))
  (let ((url (concatenate 'string
                          "http://google.com/codesearch/feeds/search?q="
                          ;; WHEN yields NIL for an absent filter, which
                          ;; CONCATENATE treats as an empty sequence.
                          (when language (concatenate 'string "lang:" language "+"))
                          (when license (concatenate 'string "license:" license "+"))
                          (when file (concatenate 'string "file:" file "+"))
                          (when package (concatenate 'string "package:" package "+"))
                          regexp)))
    (multiple-value-bind (body-or-stream status-code headers uri stream must-close reason-phrase)
        ;; HTTP-REQUEST is exported from DRAKMA, so a single colon suffices.
        (drakma:http-request url
                             :force-binary t)
      (declare (ignore headers must-close stream reason-phrase))
      (format t "Request for ~A~%" uri)
      (format t "Status code ~A~%" status-code)
      ;; Decode the raw octets as UTF-8 before handing them to the XML parser.
      (parse-xml-string
       (flexi-streams:octets-to-string
        body-or-stream
        :external-format (flexi-streams:make-external-format :utf-8))))))

;; google returns a malformed string every time - wallies!
;;(code-search "sucks" :language "pascal")

(defun code-search-hit-count (results)
  ;; The hit count is the second item of the sixth element of the parsed feed.
  (parse-integer (cadr (nth 5 results))))

(defun calculate-index-for-language (lang)
  (format t "~&For Language : ~A~&" lang)
  (let* ((control-index
           (code-search-hit-count (code-search "okay" :language lang)))
         ;; Both indices are hit counts scaled by the control word's count.
         (suckage-index
           (/ (code-search-hit-count (code-search "sucks" :language lang))
              control-index))
         (rockage-index
           (/ (code-search-hit-count (code-search "rocks" :language lang))
              control-index)))
    (format t "Suckage index ~D~&" suckage-index)
    (format t "Rockage index ~D~&" rockage-index)
    (format t "Controversy index ~F~&" (sqrt (+ (* rockage-index rockage-index) (* suckage-index suckage-index))))
    ;; admittedly it's a bust if no one ever says that language Y sucks, but how probable is that ;-)
    ;; but forth managed it...
    (format t "Rockage/Suckage ratio ~D~&" (/ rockage-index suckage-index))
    ;; Result order: control, suckage, rockage, controversy, rockage/suckage.
    (list control-index suckage-index rockage-index
          (sqrt (+ (* rockage-index rockage-index) (* suckage-index suckage-index)))
          (/ rockage-index suckage-index))))


(defun compare-languages ()
  (let* ((language-list '("c" "ruby" "perl" "pascal" "erlang" "javascript" "java" "smalltalk" "python" "lisp" "haskell" "ocaml"))
         (language-results (mapcar #'calculate-index-for-language language-list))
         ;; Each series is a label, a colour, then one value per language.
         (control-list (append (list "Frequency" "brown") (mapcar #'(lambda (x) (nth 0 x)) language-results)))
         (suckage-list (append (list "Suckage" "red") (mapcar #'(lambda (x) (nth 1 x)) language-results)))
         (rockage-list (append (list "Rockage" "green") (mapcar #'(lambda (x) (nth 2 x)) language-results)))
         (controversy-list (append (list "Controversy" "blue") (mapcar #'(lambda (x) (nth 3 x)) language-results)))
         (suckage/rockage-list (append (list "Rockage to Suckage ratio" "yellow") (mapcar #'(lambda (x) (nth 4 x)) language-results))))
    ;; The control series is built but not plotted.
    (declare (ignorable control-list))
    ;; First chart: suckage, rockage and controversy side by side.
    (cl-gd:with-image* (640 480)
      (fill-image 0 0 :color "white")
      (plot-bar-chart (list suckage-list rockage-list controversy-list) :x-axis-labels language-list :bar-width .8 :vgrid t)
      (cl-gd:write-image-to-file
       (make-pathname :defaults clot-system:*base-directory* :name "languages" :type "png") :if-exists :supersede))
    ;; Second chart: the ratio on its own.
    (cl-gd:with-image* (640 480)
      (fill-image 0 0 :color "white")
      (plot-bar-chart (list suckage/rockage-list) :x-axis-labels language-list :bar-width .8 :vgrid t)
      (cl-gd:write-image-to-file
       (make-pathname :defaults clot-system:*base-directory* :name "suckage-to-rockage" :type "png") :if-exists :supersede))))
