As a source-based Linux distribution which relies heavily on a decentralised development model, we’re always interested in new people to work on Exherbo. Any task, no matter how large or small, can help you become recognised in the free/open source ecosystem and will offer you interesting insights into how we at Exherbo work, as well as how many other projects get things done. Last but not least, you’ll most likely have lots of fun with us. :-)
Below, you’ll find project suggestions that can get you started, be it for Google’s Summer of Code or eternal fame and glory! ;-)
If you’re interested in any of these projects and would like further information, please contact Heiko “heirecka” Becker at heirecka@exherbo.org or Wulf “Philantrop” Krueger at philantrop@exherbo.org. We can usually both be found in the #exherbo channel on Libera.Chat.
Paludis is a multi-format package manager. So far we’ve mostly made use of this to deal with ‘special’ repository types like unwritten and unavailable. It would be good to integrate native support for other external repository formats into Paludis and Exherbo:
Full package manager integration for Gems isn’t something that’s been done by anyone on a production scale before. The Gems people seem interested, though.
Attempts have been made at integrating Gems support into Paludis in the past. The idea has been shown to be implementable and sound; the main stumbling blocks have been:
The Gems metadata YAML file isn’t valid YAML, and can only be parsed by one obscure, unmaintained parser. See Ciaran’s blog post about it for details.
Gems external dependencies aren’t in any useful format. Although we can mostly get by by having the user work these out by hand where necessary, ideally we’d have some way of letting developers override these.
How to handle distribution integration. Previous attempts were targeting Gentoo, where making mass changes to everything with Gems dependencies wasn’t an option. With Exherbo such a mass change is reasonably easy – you’d have to work out what a Gems dependency would look like, but the mass changes can be carried out by our friendly gnomes.
Potential mentors: Ciaran “ciaranm” McCreesh, David “dleverton” Leverton
Difficulty: Medium to high (depending on the type of repository)
Prerequisites: Decent knowledge of C++.
Since we’re using git for our exheres, patches, etc. we don’t really need checksum validation for those. It would be really nice to have checksums for DISTFILES, though. The project would be about figuring out how such checksums would be implemented (manifest files? They must not impair our git workflow, though.) and to actually develop the solution agreed upon in Paludis, our package manager.
The Paludis side of this shouldn’t be too hard because it’s basically a case of saying “don’t create manifest entries for exheres etc”. The problem is integrating it into the development workflow in a clean manner. This one’s doable without too much C++ knowledge.
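As a rough illustration of what a distfile check could look like, here is a minimal sketch. The manifest layout (one `<sha256> <filename>` pair per line) and all function names are assumptions made up for this example; no such format has been agreed upon, and the real implementation would live in Paludis.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_manifest(text: str) -> dict[str, str]:
    """Parse hypothetical '<sha256> <filename>' lines into {filename: digest}."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        checksum, filename = line.split(maxsplit=1)
        entries[filename] = checksum
    return entries


def verify_distfile(path: Path, manifest: dict[str, str]) -> bool:
    """True only if the file is listed in the manifest and its digest matches."""
    expected = manifest.get(path.name)
    return expected is not None and sha256_of(path) == expected
```

Only distfiles would be listed in such a manifest; exheres and patches stay covered by git alone, which is the point made above.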
Potential mentors: Ciaran “ciaranm” McCreesh, David “dleverton” Leverton, Bo “zlin” Ørsted Andresen
Difficulty: Easy
Prerequisites: Some knowledge of C++, some real world git experience
Our current way to handle configuration files (primarily in /etc but generally any configuration) is based upon configuring protected paths. In these paths, files aren’t simply overwritten; instead, the package manager installs newer/changed versions of existing files as dotfiles that have to be merged with the existing configuration later on by the user. Getting a fresh, clean set of configuration without (unnecessarily) re-installing an entire package, smart configuration updates, exheres-supported configuration changes, etc. aren’t easily possible with this system. This project is about implementing a new configuration handling system. A good starting point for ideas is this email on our development mailing list:
http://lists.exherbo.org/pipermail/exherbo-dev/2008-May/000137.html
Potential mentors: Ciaran “ciaranm” McCreesh
Difficulty: Medium
Prerequisites: Decent knowledge of the programming language to be used
Our current tool for managing configuration files in /etc is rather limited and not exactly user-friendly. This project is about working out what a replacement should look like and implementing it. The resulting tool should have pluggable backends and needs to work with at least git (with the option to use other CM tools if someone creates a plugin).
A good starting point for ideas is this email on our development mailing list:
http://lists.exherbo.org/pipermail/exherbo-dev/2012-February/001040.html
Potential mentors: Bo “zlin” Ørsted Andresen, Wulf “Philantrop” Krueger, Ciaran “ciaranm” McCreesh
Difficulty: Medium
Prerequisites: Decent knowledge of the programming language to be used
Version formats are currently limited to a fairly strict set of rules. These are mostly historical things we’ve inherited from Gentoo. For example, 1.2-3, 1.2B and 1.2-alpha3 are not valid versions, but 1.2.3, 1.2b and 1.2_alpha3 are; if upstreams use the former, packages have to jump through silly hoops to work around it. There’s no particular reason to keep these limitations.
This project would consist of:
Identifying which restrictions can and should be dropped. http://lists.exherbo.org/pipermail/exherbo-dev/2009-February/000400.html is a starting point.
Identifying which restrictions need to be kept, and which upstreams we shouldn’t try to handle. Although technically we could deal with roman numerals, for example, it’d probably be silly to do so.
Working out a new set of rules, including ordering. (From a Paludis perspective, and for general sanity reasons, we have to be able to less-than compare any two arbitrary versions. If a < b and b < c, a < c must hold. Similarly, if a < b, ! b < a. It’s ok to have two different-but-equal versions – currently, 000 is equal to 0.)
Updating Paludis to handle these new versions. From a public API perspective, you would have to give VersionSpec’s constructor a new parameter that tells it which set of parsing rules to use. You would then need to update every caller to pass such a parameter (fortunately, the compiler will find everywhere that needs changing for you), which in turn would require extending the EAPI definitions (this is the easy part). You would also have to extend or rewrite the current version parser and comparison rules. And, of course, you would need lots of unit tests.
Updating the various Exherbo packages that you previously identified as using ‘wrong’ version names to use the new rules.
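To make the ordering requirements concrete, here is a minimal sketch of one possible relaxed rule set. Everything in it is an assumption for illustration: the tokenisation, treating `-`, `_` and `.` as equivalent separators, ignoring letter case, and the ranking of pre-release words. It is not the current Paludis parser and not a proposed specification. Each version is mapped to a tuple key and the tuples are compared, so the ordering is transitive and asymmetric by construction.

```python
import re

# Hypothetical tokenisation: runs of digits or runs of letters; any other
# character ('.', '-', '_') is treated purely as a separator.
_TOKEN = re.compile(r"\d+|[A-Za-z]+", re.ASCII)

# Hypothetical ranking of pre-release words (lower sorts earlier).
_PRE_RELEASE = {"alpha": 0, "beta": 1, "pre": 2, "rc": 3}

# Component kinds, in sort order. END marks the end of a version so that
# "1.2_alpha3" < "1.2" < "1.2b" < "1.2.3" under these assumed rules.
PRE, END, LETTER, NUMBER = range(4)


def sort_key(version: str) -> tuple:
    """Map a version string to a tuple that orders like the version."""
    key = []
    for token in _TOKEN.findall(version):
        lowered = token.lower()
        if token.isdigit():
            key.append((NUMBER, int(token), ""))  # int() makes 000 equal 0
        elif lowered in _PRE_RELEASE:
            key.append((PRE, _PRE_RELEASE[lowered], ""))
        else:
            key.append((LETTER, 0, lowered))
    key.append((END, 0, ""))
    return tuple(key)


def less_than(a: str, b: str) -> bool:
    """Strict ordering on versions; different-but-equal versions are allowed."""
    return sort_key(a) < sort_key(b)
```

Under these assumed rules, the upstream spellings 1.2-3, 1.2B and 1.2-alpha3 compare equal to 1.2.3, 1.2b and 1.2_alpha3 respectively. Whether that is desirable is exactly the kind of question the project would have to settle.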
Potential mentors: Ciaran “ciaranm” McCreesh, David “dleverton” Leverton, Bo “zlin” Ørsted Andresen
Difficulty: Medium
Prerequisites: Applicants would need either a decent knowledge of C++, or a basic knowledge of C++ and a willingness to put up with grouchy programmers who like to yell “needs more unit tests!” and “not enough error checking!”.
We’re using categories to group packages into (more or less) logical groups. While this approach is fairly easy, it falls short in many respects (e. g. does Apache httpd belong to www-servers or net-www?). There have been several suggestions about how to get rid of categories and replace them with something more versatile, e. g. tags (like Freecode).
For this project, the following things need to be worked out:
How packages are to be specified in exheres.
An on-disk format to store them in repositories. This has performance implications because if too many packages end up in one directory, performance of Paludis degrades.
How to deal with collisions in package names.
The last two promising proposals were: http://lists.exherbo.org/pipermail/exherbo-dev/2009-January/000362.html and http://lists.exherbo.org/pipermail/exherbo-dev/2012-October/001162.html.
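One way to keep directories small once categories are gone is to shard packages into subdirectories derived from the package name. The following is only a sketch of that idea; the `packages` root, the hash-prefix scheme and the function name are assumptions for illustration and are not taken from either proposal.

```python
import hashlib
from pathlib import PurePosixPath


def shard_path(package_name: str, fanout_chars: int = 2) -> PurePosixPath:
    """Return a hypothetical repository path for a package.

    The shard directory is the first `fanout_chars` hex digits of the
    SHA-1 of the name, so two hex digits give at most 256 shard
    directories regardless of how many packages exist.
    """
    digest = hashlib.sha1(package_name.encode("utf-8")).hexdigest()
    return PurePosixPath("packages") / digest[:fanout_chars] / package_name
```

A hash prefix spreads packages evenly but is unfriendly to people browsing the repository by hand; sharding on the first letters of the name is easier to read but distributes less evenly. Either way, collisions between identically named packages still need a separate answer.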
Potential mentors: Ciaran “ciaranm” McCreesh, Wulf “Philantrop” Krueger
Difficulty: Easy
Prerequisites: Some knowledge of C++ for package manager integration
Develop a client using the Paludis library to extract metadata from Exherbo packages and check for new upstream versions. The client should be able to deliver the reports using several different interfaces. At minimum it should be able to generate a text based report locally as well as be able to mail the report and show it using a web interface.
Furthermore it should be possible to configure the client to generate reports for a subset of package repositories, package categories or specific packages.
For bonus points the client should also be able to show related data such as ChangeLogs or Release Notes when Exherbo packages contain enough metadata for that.
Potential mentors: Ciaran “ciaranm” McCreesh, David “dleverton” Leverton, Bo “zlin” Ørsted Andresen
Difficulty: Medium to high (due to the broad range of potential data sources and formats)
Prerequisites: Decent knowledge of Ruby, Python or C++ and some knowledge of website development
Build an application that can create image files useful for installing Exherbo or testing it.
The application should be able to build a number of different formats such as:
CDROM/DVD images
USB stick images
Images for virtualization products such as KVM or VMware
Users should be able to specify the partition layout, which packages to install in the image, as well as any needed configuration for boot loaders, kernel configuration, etc. The application should work for both i686 and x86_64 based images and be designed in such a way that it’ll be possible to add further architecture support to it at a later point.
This project can be implemented in just about any language but the obvious choices would probably be scripting languages like Bash, Ruby or Python. Experience with partition and bootloaders would be an advantage but not a requirement.
We’ve added a basic raw image building script called create-kvm-image written in bash. It works for most basic cases but needs a lot more features. It should be relatively easy to add most of these features to the script.
Features wanted
Ability to specify filesystem types for each partition. Right now it’s all hardcoded
Some way of specifying partition layout - the number of partitions, sizes of each and mount points
Allow users to specify their own kernel configuration instead of the mostly defconfig-derived hardcoded configuration
Possibly rewrite (parts of) the script using libguestfs. This would also allow us to run commands in the guest image relatively easily as part of the build process
Potential mentors: Bryan “kloeri” Østergaard, Wulf “Philantrop” Krueger
Difficulty: Easy to medium (depending on the intended scope of the project)
Prerequisites: Decent knowledge of shell scripting, disk imaging tools
While we believe in our rigorous peer-review process which is working well, a tool that checks at least for the most basic quality issues in our exheres would be useful. We used to do this using a Paludis client (qualudis) which was based on a per-package model and had some fairly severe drawbacks.
Potential mentors: Ciaran “ciaranm” McCreesh, David “dleverton” Leverton, Bo “zlin” Ørsted Andresen, Wulf “Philantrop” Krueger
Difficulty: Medium
Prerequisites: Experience in formal quality assurance, some knowledge of C++ for package manager integration
While we strongly believe in up-front configuration, there are use cases for some optional interactive behaviour of the package manager. This is especially true for the ‘cave resolve’ command that basically installs and uninstalls packages. Not only can it take the package manager (PM) quite some time to figure out larger resolutions, but this might occur twice (e. g. when the user makes the PM first only pretend to do something and eventually lets it actually do it).
Furthermore, recommendations (taken automatically by default) and suggestions (displayed only by default) come into this as well. Having a “staging area” in the git sense would make this much more interesting. Thus, this project is about creating an interactive mode for ‘cave resolve’, based upon ‘git add -i’.
Potential mentors: Ciaran “ciaranm” McCreesh, David “dleverton” Leverton, Bo “zlin” Ørsted Andresen
Difficulty: High
Prerequisites: Decent knowledge of C++, or a basic knowledge of C++ and a willingness to put up with grouchy programmers who like to yell “needs more unit tests!” and “not enough error checking!”.
Currently, our profiles can’t be altered from (third-party) repositories. This is especially unfortunate in the case of sub-options and the like which we have to add to (usually) arbor even though they’re needed in some third-party repository only.
Potential mentors: Ciaran “ciaranm” McCreesh, David “dleverton” Leverton, Bo “zlin” Ørsted Andresen
Difficulty: Medium
Prerequisites: Decent knowledge of C++, experience with refactoring code and writing unit tests.
Traditionally, we’ve been using so-called stage tarballs, containing essential pre-compiled packages (toolchain, package manager, etc.) to have an initial, minimal environment upon which to build a complete Exherbo installation. Now that we have working and properly documented binary packages (pbins), it’s possible to replace these stages with something smarter. This one’s mostly about figuring out how to deal with things that’re currently done in pkg_* phases.
Potential mentors: Bo “zlin” Ørsted Andresen, Ciaran “ciaranm” McCreesh, Wulf “Philantrop” Krueger
Difficulty: Easy
Prerequisites: -
Some software has its own external kernel modules (e. g. VirtualBox, VMware, nvidia-drivers, etc.). Currently, we simply install their sources to /usr/src and tell the user to compile them. While this works, of course, this is something that could certainly be done more nicely with some support inside the distribution, especially since such modules may need to be built for several different kernel versions.
Potential mentors: Bo “zlin” Ørsted Andresen, Wulf “Philantrop” Krueger
Difficulty: Easy to medium (depending on the system to be chosen)
Prerequisites: -
There’re some packages, like dev-perl/autodie, dev-perl/Module-Build, etc, which exist both as separate packages, and as part of core perl itself. To make things more confusing, not all versions of perl contain them. And you can’t even just do a simple “>=5.10” for some, since there’s 5.8.9 which came out later.
So, one solution that has been suggested is to add 2 new metadata keys, CONTAINS and CONTAINED_IN. They would each be space-separated lists of packages.
CONTAINS would be for dev-lang/perl, and would look something like:
dev-lang/perl/perl-5.10.1.exheres-0:
...
CONTAINS="dev-perl/Module-Build[=0.340201] ..."
...
# Not sure I like using the [=version] syntax here, but not sure of a
# better one.
Then, we would have a dev-perl/Module-Build like:
dev-perl/Module-Build/Module-Build-0.340201.exheres-0
...
CONTAINED_IN="dev-lang/perl[=5.10.1]"
...
Finally, a package that needs a particular version of Module-Build would simply dep upon it as if it were just a stand-alone package:
dev-perl/foo/foo-123.exheres-0
DEPENDENCIES="
build:
dev-perl/Module-Build[>=0.340201]
...
"
If perl 5.10.1 is already installed, and 0.340201 is the best available version of Module-Build around, then nothing will need to be pulled in. Otherwise, the newest Module-Build would be installed.
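The resolution rule just described can be sketched as follows. This is only an illustration under stated assumptions: the `CONTAINS` table, the function names and the naive dotted-integer version comparison are all made up for the example, and a standalone copy that is already installed is ignored for brevity.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Naive version parsing for this sketch: dot-separated integers only."""
    return tuple(int(part) for part in text.split("."))


# Hypothetical metadata mirroring the CONTAINS example above:
# (package, version) -> {bundled package: bundled version}
CONTAINS = {
    ("dev-lang/perl", "5.10.1"): {"dev-perl/Module-Build": "0.340201"},
}


def bundled_version(package: str, installed: dict[str, str]):
    """Best version of `package` bundled by any installed package, or None."""
    candidates = [
        parse_version(bundled[package])
        for (name, version), bundled in CONTAINS.items()
        if installed.get(name) == version and package in bundled
    ]
    return max(candidates, default=None)


def version_to_install(package, minimum, installed, available):
    """Return the standalone version to install, or None if nothing is needed.

    Nothing is pulled in when an installed package bundles a version that
    satisfies the dependency and no newer standalone version is available.
    """
    wanted = parse_version(minimum)
    provided = bundled_version(package, installed)
    usable = [v for v in available if parse_version(v) >= wanted]
    best = max(usable, key=parse_version, default=None)
    if provided is not None and provided >= wanted:
        if best is None or parse_version(best) <= provided:
            return None
    return best
```

With perl 5.10.1 installed and 0.340201 as the only available Module-Build, `version_to_install` returns `None`. With a perl that does not bundle Module-Build, it returns the newest available version that satisfies the dependency.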
Potential mentors: David “dleverton” Leverton
Difficulty: Easy
Prerequisites: -
We currently use RESTRICT as inherited from EAPI 0. RESTRICT=test is used the most, to prevent broken tests from running, but userpriv and strip are also possible values. default_src_test tries to detect if there is a Makefile that has a check/test target and runs that if possible. When RESTRICT=test is set, the src_test phase isn’t run. There are a few downsides to this approach:
Paludis can’t tell if a package has a test suite in advance.
We don’t want to RESTRICT tests when a package doesn’t have them because it could get added later without anyone noticing.
With RESTRICT=test, src_test_expensive still runs. If we just removed RESTRICT=test, one would have to define an empty src_test and Paludis would show a recommended_tests option that would do nothing.
What we want from a replacement:
Know in advance if/which tests will be run.
Allow running specific phases without userpriv. Maybe even have e.g. two test phases. One with userpriv and one without.
We would like to clearly mark if a package has (no tests/tests/expensive tests/both/broken tests).
In case tests are broken, we would like to mark whose fault that is (upstream/downstream).
A replacement called CONTROL has been proposed here but hasn’t been implemented so far. Example uses could be:
CONTROL="test [[ mode=broken privs=root ]]"
CONTROL="test [[ recommended=true expensive=true ]]"
It could also support annotations for other functions, e.g.
CONTROL="compile [[ parallel=false ]]"
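To show how such values could be consumed, here is a minimal parsing sketch. It assumes the syntax is exactly what the examples suggest (a phase name followed by `[[ key=value ... ]]`) and, as a further assumption, that several phases may appear in one value. Since CONTROL was never implemented, none of this reflects real Paludis behaviour.

```python
import re

# One entry: a phase name followed by an annotation block in [[ ... ]].
_ENTRY = re.compile(r"(\w+)\s*\[\[(.*?)\]\]")


def parse_control(value: str) -> dict[str, dict[str, str]]:
    """Parse a hypothetical CONTROL value into {phase: {key: value}}."""
    result = {}
    for phase, body in _ENTRY.findall(value):
        annotations = {}
        for pair in body.split():
            key, _, val = pair.partition("=")
            annotations[key] = val
        result[phase] = annotations
    return result
```

A package manager could then answer the "know in advance" questions from metadata alone, for example by checking whether `parse_control(value).get("test", {}).get("mode")` is `"broken"` before any phase runs.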
Potential mentors: Bo “zlin” Ørsted Andresen, Ciaran “ciaranm” McCreesh
Difficulty: Medium
Prerequisites: Decent knowledge of C++, experience with refactoring code and writing unit tests.
Use the Paludis API (probably the Ruby or Python bindings, rather than the C++ API) to extract information about available packages, and present it in the form of a website.
We currently have “SUMMER”, a “Statically Updated Metadata Manifestation for Exherbo Repositories”. It’s a crude Ruby script accessible from http://summer.exherbolinux.org.
The wishlist of features can be found in the README.
Potential mentors: Bo “zlin” Ørsted Andresen, Ciaran “ciaranm” McCreesh
Difficulty: Medium
Prerequisites: Familiarity with HTML, CSS and probably CGI, as well as one of Ruby, Python or C++.
A client/server based application collecting and displaying a variety of statistics about package use on client machines.
The client part should gather information using the paludis library bindings and submit them to the server. The server part is responsible for receiving all the data and providing a web interface where users can show statistics from the collected data, filtering and sorting the data as needed.
Security is an important part of this project as all collected data must remain absolutely anonymous. If data is leaked that can identify either persons or machines this can harm our users as well as the Exherbo project.
The student is well advised to look at Debian Popularity Contest and Ubuntu Popularity Contest for existing implementations of this idea and inspiration.
This project is easiest to implement in Ruby or Python but can be implemented in C++ or a combination of these languages. The project doesn’t require too much prior knowledge of Paludis and should be reasonably easy.
Potential mentors: Bo “zlin” Ørsted Andresen, Ciaran “ciaranm” McCreesh
Difficulty: Easy
Prerequisites: Familiarity with Ruby, Python or C++. Network protocols, client/server communication
Currently we have a tool called Sydbox to do basic sandboxing which aims to detect misbehaving builds. Being a very simple tool, Sydbox doesn’t allow programmatic access to its internals. For this reason, a new project called PinkTrace has been started.
PinkTrace is a ptrace() wrapper library which aims to make writing tracing programs easy. ptrace() is very OS and architecture dependent. Writing portable tracing programs requires a tremendous amount of work.
Currently PinkTrace has an unstable API and ABI. There are many parts that need improving. The main plan is to write a callback-driven higher-level library called pinktrace-easy on top of PinkTrace. Some preliminary work has started in the easy branch.
Potential mentors: Ali “alip” Polatel
Difficulty: Medium
Prerequisites: Familiarity with C and the C99 standard
In 2011, we implemented native “multilib” support for the i686 (aka x86) and x86_64 (aka amd64) architectures.
While this works nicely, there’s lots of room for improvement with respect to the existing supported arches as well as adding support for additional arches. Some possible changes are discussed here:
http://lists.exherbo.org/pipermail/exherbo-dev/2011-December/001012.html
Potential mentors: David “dleverton” Leverton, Bo “zlin” Ørsted Andresen
Difficulty: Medium to high
Prerequisites: Decent knowledge of the GNU/Linux (development) toolchain (gcc, glibc, binutils, etc.), some C++ knowledge would be beneficial for the package manager side of this project.
Currently, when a shared library breaks binary compatibility by changing its SONAME, the user has to run cave fix-linkage afterwards to find and rebuild the affected packages. It would be better if this could be done automatically when the library is upgraded; even better would be to allow the old version of the library to stay installed until the dependent packages are upgraded, but without colliding with the new one over headers, static libraries, etc. There’s some discussion of the first part at:
http://lists.exherbo.org/pipermail/exherbo-dev/2011-March/000889.html
For the second, an initial attempt might be to have a different SLOT for each SONAME, and when a new SLOT is released, add a new revision to the old one that only includes the library itself. More sophisticated would be to avoid the need to create the new revision explicitly and make the package manager somehow know that old versions only need the library. The yet-unimplemented “parts” concept, which involves classifying the installed files as libraries, headers, data, etc, would probably be useful here.
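The two ideas can be illustrated with a small sketch. It is not how cave fix-linkage works internally: a real implementation would read the needed libraries from the installed ELF files, whereas here the mapping is passed in directly. The function names and the rule for deriving a SLOT from a SONAME are assumptions for illustration only.

```python
def broken_packages(needed: dict[str, set[str]], provided: set[str]) -> list[str]:
    """Packages that need at least one SONAME which is no longer provided.

    `needed` maps a package name to the SONAMEs its binaries link against;
    `provided` is the set of SONAMEs available after the upgrade.
    """
    return sorted(
        package for package, sonames in needed.items()
        if not set(sonames) <= provided
    )


def slot_for(soname: str) -> str:
    """Hypothetical rule: use the version suffix of the SONAME as the SLOT."""
    _, _, suffix = soname.partition(".so.")
    return suffix or "0"
```

If the old SLOT stays installed with only its library, its SONAME remains in `provided` and `broken_packages` reports nothing until the dependent packages have been rebuilt against the new SLOT.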
Potential mentors: Ciaran “ciaranm” McCreesh, David “dleverton” Leverton
Difficulty: Medium to high
Prerequisites: Decent knowledge of C++.
Copyright 2009-2012 Bryan Østergaard, Ciaran McCreesh, Fernando J. Pereda and Wulf C. Krueger