280,000 Open Source Projects: A Side Story

Source: 华拓网

I. Tools

1. Data scraping

Of course, the detailed data for the 331 projects covered later still had to be fetched through the OpenHub API.
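As a rough illustration of what fetching one project's record might look like, here is a minimal Ruby sketch. It is not the script used for this article. The URL pattern (`/projects/<id>.xml` with an `api_key` query parameter) is an assumption that should be checked against the Open Hub API documentation, and the helper names `project_uri` and `fetch_project_xml` are made up for this example.

```ruby
require "net/http"
require "uri"

# Assumed base URL and path pattern; verify against the Open Hub API docs.
OPENHUB_BASE = "https://www.openhub.net"

# Build the request URI for one project.
# project_id is assumed to be a URL-safe slug such as "firefox".
def project_uri(project_id, api_key)
  uri = URI("#{OPENHUB_BASE}/projects/#{project_id}.xml")
  uri.query = URI.encode_www_form(api_key: api_key)
  uri
end

# Fetch the raw XML body for one project, raising on any non-2xx response.
def fetch_project_xml(project_id, api_key)
  response = Net::HTTP.get_response(project_uri(project_id, api_key))
  unless response.is_a?(Net::HTTPSuccess)
    raise "Open Hub request failed: HTTP #{response.code}"
  end
  response.body
end
```

Calling `fetch_project_xml` once per project in a loop, with a pause between requests, would be enough for a list of a few hundred projects. Parsing the returned XML is left out because the element names depend on the actual API response.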

2. Data analysis

Entirely ad hoc: sqlite3 + Numbers + CSV + Ruby. I used whatever tool came to hand.

3. Data presentation

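II. Clearing up a question: the relationship between project size and creation date

I had a short discussion with @云风 on Weibo, prompted by two conclusions from my earlier analysis: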

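  • Whether a project uses GitHub: the newer the project, the more willing it is to use it; the larger the project, the less able it is to use it.
  • Whether a project uses GitHub to manage its issues: the newer the project, the more willing it is to use it; the larger the project, the less able it is to use it.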

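The wording of these conclusions is deliberate. Strictly speaking, "new" pairs with "old" and "small" with "large"; "willing" pairs with "unwilling" and "able to use" with "unable to use". Neither of my two conclusions is properly parallel, and that was intentional.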

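I did not continue the discussion afterwards, but I kept thinking about the question: "How strongly is a project's size actually correlated with its creation date?"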

Later I analyzed the two figures by plotting log(days since the first commit) against log(lines of code), which gave roughly the following chart:


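After some number-crunching in the mighty Excel, the correlation coefficient between the two series comes out at about 0.203. In other words, there is a weak positive correlation.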
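The same calculation can be reproduced outside Excel. The sketch below computes a Pearson correlation coefficient on log-transformed values in plain Ruby. The field names `:age_days` and `:lines_of_code` and the helper names are assumptions made for this example, and the sample rows are invented, not taken from the 331-project data set. Base-10 logarithms are used here, but the choice of base does not change the correlation coefficient.

```ruby
# Pearson correlation coefficient between two equal-length numeric arrays.
def pearson(xs, ys)
  unless xs.size == ys.size && xs.size >= 2
    raise ArgumentError, "need two equal-length arrays with at least 2 points"
  end

  n = xs.size.to_f
  mean_x = xs.sum / n
  mean_y = ys.sum / n

  # Sum of products of deviations, and sums of squared deviations.
  cov   = xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) }
  var_x = xs.sum { |x| (x - mean_x)**2 }
  var_y = ys.sum { |y| (y - mean_y)**2 }

  cov / Math.sqrt(var_x * var_y)
end

# Correlation between log10(age in days) and log10(lines of code).
# Rows with a non-positive value are skipped because their log is undefined.
def log_log_correlation(projects)
  usable  = projects.select { |p| p[:age_days] > 0 && p[:lines_of_code] > 0 }
  log_age = usable.map { |p| Math.log10(p[:age_days]) }
  log_loc = usable.map { |p| Math.log10(p[:lines_of_code]) }
  pearson(log_age, log_loc)
end

# Invented rows for illustration only; these are not the article's data.
sample_projects = [
  { name: "alpha", age_days: 400,  lines_of_code: 12_000 },
  { name: "beta",  age_days: 2500, lines_of_code: 90_000 },
  { name: "gamma", age_days: 6000, lines_of_code: 45_000 },
  { name: "delta", age_days: 900,  lines_of_code: 300_000 }
]

puts log_log_correlation(sample_projects).round(3)
```

A coefficient near 0.2, as reported above, means that age explains only a small share of the variation in size, which is why the relationship is described as weak.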

III. Open source

IV. The list

The 331 open source projects are listed below:

Name
Metasploit Framework
NetBSD
GNU C Library
cURL
Python programming language
Linux Kernel
GNU Emacs
gnulib
GNU Core Utilities
GNU Compiler Collection
Wine
Debian
GNU Octave
Visualization Toolkit
pf
GDB
GNU binutils
GHC
Zope
FreeBSD
Perl
GNU LilyPond Music Typesetter
Gnus
ikiwiki
Samba
PHP
FreeBSD Ports
pkgsrc: The NetBSD Packages Collection
Mesa
Squid Cache
KDElibs (KDE)
gedit
Evolution
Kontact
KDE PIM
Advanced Linux Sound Architecture (ALSA)
Wireshark
OpenSSL
GIMP
NetBeans IDE
Koha Library Automation Package
openSUSE Linux
Doxygen
libcurl
GStreamer
GNOME
Insight Toolkit
zsh
Nautilus
X.Org
Mozilla Core
MariaDB
CMake
LibreOffice
ALT Linux
ParaView
GTK+
Poedit
Bugzilla
Enlightenment (window manager)
FFmpeg
GLib
PEAR
Ruby
GnuCash
phpMyAdmin
Mono
SWIG
SWT (Standard Widget Toolkit)
Checkstyle
Eclipse Java Development Tools (JDT)
Eclipse Platform Project
Natural Language Toolkit (NLTK)
Ekiga
Boost C++ Libraries
Kate (KDE)
Devhelp
Arch Linux Packages
SPIP
GNOME Terminal
ScummVM
Anjuta DevStudio
BlueZ
Eye of GNOME
Tor
Fedora Packages
Haiku
Stellarium
Totem
Rhythmbox
Gentoo Linux
CDT (Eclipse)
JRuby
eZ Publish
VLC media player
Equinox
Epiphany
Thunderbird
GeoTools
PyPy
KDE
apt - Advanced Package Tool
Moodle
Calligra Suite
QGIS
Mozilla Firefox
coreboot
Tiki Wiki CMS Groupware
Apache Maven 2
Plone
Superior Lisp Interaction Mode for Emacs
Kodi
MythTV
systemd
GeoServer
Groovy
Blender
MySQL
iproute2
MonoDevelop
Hibernate
NetworkManager
NLog - Advanced Logging
GParted
Seahorse
Glade User Interface Designer
Jenkins
IntelliJ IDEA Community Edition
Ruby on Rails
BusyBox
Evince
DokuWiki
Linux NTFS file system support
KVM
Battle for Wesnoth
Git
SPIP-Zone
Mercurial
Hibernate Entity Manager
Racket
RubyGems
SQLAlchemy
cabal
U-Boot
WebKit
OpenEmbedded
Yocto Project
matplotlib
Symfony
Meld
Haxe
FreeSWITCH
Geany
collectd
Gramps
phpBB Forum Software
HAProxy
fail2ban
NumPy
Scala
dpkg
Nette Framework
Inkscape
Phing
jBPM
JBoss Drools
Bitbake
Zotero
Lutece
OTRS
Sage: Open Source Mathematics Software
Rockbox
Liferay Portal
TYPO3 CMS
Vala
pylint
The LLVM Compiler Infrastructure
libvirt
TinyMCE
Django
PHPUnit
OpenStreetMap
SymPy
Xen Project (Hypervisor)
Eclipse Mylyn
PHP_CodeSniffer
Sakai LMS (core)
Spring Framework
Joomla!
Marble
LXDE
Pygments
OpenLayers
The MacPorts Project
calibre
Grails
Alfresco Content Management
util-linux
jQuery
Vaadin
Cython
Dojo Toolkit
MediaWiki
Second Life Viewer
Munin
Odoo
Mozilla Calendar
KDevelop
ZNC
Werkzeug
cppcheck
Wicket Stuff
Drush
Sphinx documentation builder
Piwik
JDownloader
SeaMonkey
Empathy
SilverStripe
PulseAudio
LLVM/Clang C family frontend
Pylons
MongoDB
Mockito
Doctrine
Pacman
MAME - Multiple Arcade Machine Emulator
Rubinius
Apache Camel
OpenJDK
Buildbot
MPD
Tracker
org-mode
Sass
WPA/WPA2/IEEE 802.1X Supplicant
Go programming language
Apache CouchDB
Qt 4
Apache CXF
CakePHP
CKeditor WYSIWYG editor
SciPy
gitg
Banshee
OGRE
Chromium (Google Chrome)
Gradle
Netty Project
Sinatra
Chef
Gerrit Code Review
GNOME Shell
Git Extensions
Qt Creator
Kohana v3
Android
JUnit
PCSX2
Shotwell
Redis
Cassandra
PhoneGap
Trinity Core
Icinga
CyanogenMod
Rygel
QEMU
Trinity Core2
Pitivi
Openfire
Apache Hadoop
akka
JGit
Homebrew
Oh My Zsh
ehcache
EGit
node.js (NodeJs)
Thunar
Selenium
Arquillian
Erlang
YUI
Gunicorn
CoffeeScript
Clementine Music Player
scikit learn
Processing
Vagrant
Qt 5
Yii PHP Framework
Zend Framework
Apache Spark
Flask
OsmAnd
ownCloud
Open Computer Vision Library (OpenCV)
phpDocumentor
IPython
RSpec
OpenStack
OpenStack Nova
Apache CloudStack
AngularJS
GWT (formerly Google Web Toolkit)
Facter
salt
jMonkey Engine
Puppet
Play! framework
Elasticsearch
Bootstrap (Twitter)
Apache OpenOffice
GlassFish
Propel
JabRef
CodeIgniter
GNOME Boxes
GitLab
TiddlyWiki
Fish shell
Ansible
Simple Machines Forum
FontForge
libgdx
py-pandas
javascript
EasyTAG
docker
Capistrano