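I. Tools
1. Data collection
Of course, the detailed data for the 331 projects covered later still has to be fetched through the OpenHub API.
2. Data analysis
Entirely makeshift: sqlite3 + Numbers + csv + ruby. I used whatever tool came to hand.
3. Data presentation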
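As a rough illustration of this kind of csv-plus-sqlite3 workflow, here is a minimal Python sketch (the original work used ruby and Numbers; Python is swapped in here). The table name `projects`, the column names, and the sample rows are all hypothetical placeholders, not the real OpenHub data.

```python
import csv
import io
import sqlite3

# Hypothetical sample rows; the real export from OpenHub is not shown in this post.
SAMPLE_CSV = """name,first_commit_days,code_lines
alpha,4000,1200000
beta,900,35000
gamma,2500,480000
"""


def load_projects(conn, csv_text):
    """Create a (hypothetical) projects table and fill it from CSV text."""
    conn.execute(
        "CREATE TABLE projects ("
        "name TEXT, first_commit_days INTEGER, code_lines INTEGER)"
    )
    rows = [
        (r["name"], int(r["first_commit_days"]), int(r["code_lines"]))
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    conn.executemany("INSERT INTO projects VALUES (?, ?, ?)", rows)
    return len(rows)


def largest_projects(conn, limit):
    """Return (name, code_lines) pairs, largest code base first."""
    cur = conn.execute(
        "SELECT name, code_lines FROM projects "
        "ORDER BY code_lines DESC LIMIT ?",
        (limit,),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
load_projects(conn, SAMPLE_CSV)
print(largest_projects(conn, 2))
```

An in-memory database keeps the sketch self-contained; for a real analysis one would presumably point `sqlite3.connect` at a file and read the CSV from disk.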
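II. Clearing up a question: project size versus project age
I had a brief exchange with @云风 on Weibo. It started from some conclusions in my earlier analysis:
- Whether to use GitHub: the newer a project is, the more willing it is to use it; the larger a project is, the less able it is to use it.
- Whether to use GitHub to manage a project's issues: the newer a project is, the more willing it is to use it; the larger a project is, the less able it is to use it.
The wording of these conclusions is deliberate. Strictly speaking, new pairs with old and small pairs with large; willing pairs with unwilling and able pairs with unable. Neither of my two conclusions is neatly parallel, and that was intentional.
I did not continue the discussion afterwards, but I kept thinking about the question: "How strongly is a project's size actually correlated with when it was created?"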
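Later I analyzed the two measures by plotting Log(days since the first commit) against Log(lines of code), which gave roughly the following chart:
With the mighty Excel doing the calculation, the correlation coefficient between the two measures comes out at about 0.203. In other words, there is a fairly weak positive correlation.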
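For readers who want to reproduce this kind of figure without a spreadsheet, here is a minimal Python sketch of a Pearson correlation on log-transformed values. It assumes the coefficient is computed between the two logged measures, and the base-10 logarithm is also an assumption (the base does not change the coefficient). The function names and the input shape, a list of `(days, lines_of_code)` pairs, are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


def log_log_correlation(projects):
    """Correlate log10(age in days) with log10(lines of code).

    `projects` is a list of (days_since_first_commit, code_lines) pairs;
    both values must be positive for the logarithm to be defined.
    """
    ages = [math.log10(days) for days, _ in projects]
    sizes = [math.log10(lines) for _, lines in projects]
    return pearson(ages, sizes)
```

Calling `log_log_correlation` on the real 331 `(days, lines)` pairs should give a value comparable to the spreadsheet result, provided the same transformation was used there.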
III. Open source
IV. The list
The 331 open source projects are listed below:
Name | Homepage |
---|---|
Metasploit Framework | |
NetBSD | |
GNU C Library | |
cURL | |
Python programming language | |
Linux Kernel | |
GNU Emacs | |
gnulib | |
GNU Core Utilities | |
GNU Compiler Collection | |
Wine | |
Debian | |
GNU Octave | |
Visualization Toolkit | |
pf | |
GDB | |
GNU binutils | |
GHC | |
Zope | |
FreeBSD | |
Perl | |
GNU LilyPond Music Typesetter | |
Gnus | |
ikiwiki | |
Samba | |
PHP | |
FreeBSD Ports | |
pkgsrc: The NetBSD Packages Collection | |
Mesa | |
Squid Cache | |
KDElibs (KDE) | |
gedit | |
Evolution | |
Kontact | |
KDE PIM | |
Advanced Linux Sound Architecture (ALSA) | |
Wireshark | |
OpenSSL | |
GIMP | |
NetBeans IDE | |
Koha Library Automation Package | |
openSUSE Linux | |
Doxygen | |
libcurl | |
GStreamer | |
GNOME | |
Insight Toolkit | |
zsh | |
Nautilus | |
X.Org | |
Mozilla Core | |
MariaDB | |
CMake | |
LibreOffice | |
ALT Linux | |
ParaView | |
GTK+ | |
Poedit | |
Bugzilla | |
Enlightenment (window manager) | |
FFmpeg | |
GLib | |
PEAR | |
Ruby | |
GnuCash | |
phpMyAdmin | |
Mono | |
SWIG | |
SWT (Standard Widget Toolkit) | |
Checkstyle | |
Eclipse Java Development Tools (JDT) | |
Eclipse Platform Project | |
Natural Language Toolkit (NLTK) | |
Ekiga | |
Boost C++ Libraries | |
Kate (KDE) | |
Devhelp | |
Arch Linux Packages | |
SPIP | |
GNOME Terminal | |
ScummVM | |
Anjuta DevStudio | |
BlueZ | |
Eye of GNOME | |
Tor | |
Fedora Packages | |
Haiku | |
Stellarium | |
Totem | |
Rhythmbox | |
Gentoo Linux | |
CDT (Eclipse) | |
JRuby | |
eZ Publish | |
VLC media player | |
Equinox | |
Epiphany | |
Thunderbird | |
GeoTools | |
PyPy | |
KDE | |
apt - Advanced Package Tool | |
Moodle | |
Calligra Suite | |
QGIS | |
Mozilla Firefox | |
coreboot | |
Tiki Wiki CMS Groupware | |
Apache Maven 2 | |
Plone | |
Superior Lisp Interaction Mode for Emacs | |
Kodi | |
MythTV | |
systemd | |
GeoServer | |
Groovy | |
Blender | |
MySQL | |
iproute2 | |
MonoDevelop | |
Hibernate | |
NetworkManager | |
NLog - Advanced Logging | |
GParted | |
Seahorse | |
Glade User Interface Designer | |
Jenkins | |
IntelliJ IDEA Community Edition | |
Ruby on Rails | |
BusyBox | |
Evince | |
DokuWiki | |
Linux NTFS file system support | |
KVM | |
Battle for Wesnoth | |
Git | |
SPIP-Zone | |
Mercurial | |
Hibernate Entity Manager | |
Racket | |
RubyGems | |
SQLAlchemy | |
cabal | |
U-Boot | |
WebKit | |
OpenEmbedded | |
Yocto Project | |
matplotlib | |
Symfony | |
Meld | |
Haxe | |
FreeSWITCH | |
Geany | |
collectd | |
Gramps | |
phpBB Forum Software | |
HAProxy | |
fail2ban | |
NumPy | |
Scala | |
dpkg | |
Nette Framework | |
Inkscape | |
Phing | |
jBPM | |
JBoss Drools | |
Bitbake | |
Zotero | |
Lutece | |
OTRS | |
Sage: Open Source Mathematics Software | |
Rockbox | |
Liferay Portal | |
TYPO3 CMS | |
Vala | |
pylint | |
The LLVM Compiler Infrastructure | |
libvirt | |
TinyMCE | |
Django | |
PHPUnit | |
OpenStreetMap | |
SymPy | |
Xen Project (Hypervisor) | |
Eclipse Mylyn | |
PHP_CodeSniffer | |
Sakai LMS (core) | |
Spring Framework | |
Joomla! | |
Marble | |
LXDE | |
Pygments | |
OpenLayers | |
The MacPorts Project | |
calibre | |
Grails | |
Alfresco Content Management | |
util-linux | |
jQuery | |
Vaadin | |
Cython | |
Dojo Toolkit | |
MediaWiki | |
Second Life Viewer | |
Munin | |
Odoo | |
Mozilla Calendar | |
KDevelop | |
ZNC | |
Werkzeug | |
cppcheck | |
Wicket Stuff | |
Drush | |
Sphinx documentation builder | |
Piwik | |
JDownloader | |
SeaMonkey | |
Empathy | |
SilverStripe | |
PulseAudio | |
LLVM/Clang C family frontend | |
Pylons | |
MongoDB | |
Mockito | |
Doctrine | |
Pacman | |
MAME - Multiple Arcade Machine Emulator | |
Rubinius | |
Apache Camel | |
OpenJDK | |
Buildbot | |
MPD | |
Tracker | |
org-mode | |
Sass | |
WPA/WPA2/IEEE 802.1X Supplicant | |
Go programming language | |
Apache CouchDB | |
Qt 4 | |
Apache CXF | |
CakePHP | |
CKeditor WYSIWYG editor | |
SciPy | |
gitg | |
Banshee | |
OGRE | |
Chromium (Google Chrome) | |
Gradle | |
Netty Project | |
Sinatra | |
Chef | |
Gerrit Code Review | |
GNOME Shell | |
Git Extensions | |
Qt Creator | |
Kohana v3 | |
Android | |
JUnit | |
PCSX2 | |
Shotwell | |
Redis | |
Cassandra | |
PhoneGap | |
Trinity Core | |
Icinga | |
CyanogenMod | |
Rygel | |
QEMU | |
Trinity Core2 | |
Pitivi | |
Openfire | |
Apache Hadoop | |
akka | |
JGit | |
Homebrew | |
Oh My Zsh | |
ehcache | |
EGit | |
node.js (NodeJs) | |
Thunar | |
Selenium | |
Arquillian | |
Erlang | |
YUI | |
Gunicorn | |
CoffeeScript | |
Clementine Music Player | |
scikit learn | |
Processing | |
Vagrant | |
Qt 5 | |
Yii PHP Framework | |
Zend Framework | |
Apache Spark | |
Flask | |
OsmAnd | |
ownCloud | |
Open Computer Vision Library (OpenCV) | |
phpDocumentor | |
IPython | |
RSpec | |
OpenStack | |
OpenStack Nova | |
Apache CloudStack | |
AngularJS | |
GWT (formerly Google Web Toolkit) | |
Facter | |
salt | |
jMonkey Engine | |
Puppet | |
Play! framework | |
Elasticsearch | |
Bootstrap (Twitter) | |
Apache OpenOffice | |
GlassFish | |
Propel | |
JabRef | |
CodeIgniter | |
GNOME Boxes | |
GitLab | |
TiddlyWiki | |
Fish shell | |
Ansible | |
Simple Machines Forum | |
FontForge | |
libgdx | |
py-pandas | |
javascript | |
EasyTAG | |
docker | |
Capistrano |