Suggested projects

As a source-based Linux distribution that relies heavily on a decentralised development model, we’re always interested in new people who’d like to work on Exherbo. Any task, no matter how large or small, can help you become recognised in the free/open source ecosystem and will offer you interesting insights into how we at Exherbo work, as well as how many other projects get things done. Last but not least, you’ll most likely have lots of fun with us. :-)

Below, you’ll find project suggestions that can get you started, be it for Google’s Summer of Code or for eternal fame and glory! ;-)

If you’re interested in any of these projects and would like further information, please contact Heiko “heirecka” Becker at heirecka@exherbo.org or Wulf “Philantrop” Krueger at philantrop@exherbo.org. Both of us can usually be found in the #exherbo channel on Libera.Chat.

Native support for several external repository formats in Paludis

Paludis is a multi-format package manager. So far we’ve mostly made use of this to deal with ‘special’ repository types like unwritten and unavailable. It would be good to integrate native support for other external repository formats into Paludis and Exherbo:

Full package manager integration for Gems isn’t something that’s been done by anyone on a production scale before. The Gems people seem interested, though.

Attempts have been made at integrating Gems support into Paludis in the past. The idea has been shown to be implementable and sound; the main stumbling blocks have been:

Potential mentors: Ciaran “ciaranm” McCreesh, David “dleverton” Leverton

Difficulty: Medium to high (depending on the type of repository)

Prerequisites: Decent knowledge of C++.

DISTFILES checksums/manifests

Since we’re using git for our exheres, patches, etc., we don’t really need checksum validation for those. It would be really nice to have checksums for DISTFILES, though. The project would be about figuring out how such checksums should be implemented (manifest files? They must not impair our git workflow, though.) and then actually developing the agreed-upon solution in Paludis, our package manager.

The Paludis side of this shouldn’t be too hard because it’s basically a case of saying “don’t create manifest entries for exhereses etc”. The problem is integrating it into the development workflow in a clean manner. This one’s doable without too much C++ knowledge.
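To make the idea a little more concrete, here is a minimal sketch of what a DISTFILES-only manifest could do. It is written in Python purely for illustration; the manifest layout (file name mapped to size and SHA-256) and the function names are assumptions, not an agreed-upon format and not anything Paludis currently implements.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(distdir: Path) -> dict:
    """Map each distfile name to its size and SHA-256 (hypothetical layout).

    Only downloaded distfiles are covered; exheres and patches are
    deliberately left out because git already tracks their integrity.
    """
    return {
        entry.name: {"size": entry.stat().st_size, "sha256": sha256_of(entry)}
        for entry in sorted(distdir.iterdir())
        if entry.is_file()
    }


def verify(distdir: Path, manifest: dict) -> list:
    """Return the names of distfiles that are missing or do not match."""
    bad = []
    for name, expected in manifest.items():
        path = distdir / name
        if not path.is_file() or sha256_of(path) != expected["sha256"]:
            bad.append(name)
    return bad
```

The open question from the text remains untouched by this sketch: where such a manifest would live and how it would be updated without getting in the way of the git workflow.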

Potential mentors: Ciaran “ciaranm” McCreesh, David “dleverton” Leverton, Bo “zlin” Ørsted Andresen

Difficulty: Easy

Prerequisites: Some knowledge of C++, some real world git experience

Replacing CONFIG_PROTECT

Our current way to handle configuration files (primarily in /etc but generally any configuration) is based upon configuring protected paths. Files in those paths aren’t simply overwritten; instead, the package manager installs newer/changed versions of existing files as dotfiles that have to be merged with the existing configuration later on by the user. Getting a fresh, clean set of configuration without (unnecessarily) re-installing an entire package, smart configuration updates, exheres-supported configuration changes, etc. aren’t easily possible with this system. This project is about implementing a new configuration handling system. A good starting point for ideas is this email on our development mailing list:

http://lists.exherbo.org/pipermail/exherbo-dev/2008-May/000137.html
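For readers unfamiliar with the current behaviour, the following Python sketch models it in a simplified way. The function names and the `._cfgNNNN_` dotfile naming pattern are assumptions made for illustration; this is not the package manager’s actual merge code.

```python
from pathlib import Path


def is_protected(target: Path, protected_dirs: list) -> bool:
    """True if target lives inside one of the protected directories."""
    return any(directory in target.parents for directory in protected_dirs)


def install_file(new_content: bytes, target: Path, protected_dirs: list) -> Path:
    """Install a file and return the path that was actually written.

    If the target is protected, already exists and differs from the new
    content, the new version is written next to it as a dotfile so the
    user can merge it later. Otherwise the target is simply (over)written.
    """
    if (
        is_protected(target, protected_dirs)
        and target.exists()
        and target.read_bytes() != new_content
    ):
        counter = 0
        while True:
            # Hypothetical naming scheme: ._cfg0000_<name>, ._cfg0001_<name>, ...
            candidate = target.with_name(f"._cfg{counter:04d}_{target.name}")
            if not candidate.exists():
                break
            counter += 1
        target = candidate
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(new_content)
    return target
```

The sketch also shows why the wishes listed above are hard to meet: the only information kept is “old file” and “new file”, with no record of what the package originally shipped or what the user changed.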

Potential mentors: Ciaran “ciaranm” McCreesh

Difficulty: Medium

Prerequisites: Decent knowledge of the programming language to be used

Configuration file management

Our current tool for managing configuration files in /etc is rather limited and not exactly user-friendly. This project is about working out what a replacement should look like and implementing it. The resulting tool should have pluggable backends and needs to work with at least git (with the option to use other CM tools if someone creates a plugin).

A good starting point for ideas is this email on our development mailing list:

http://lists.exherbo.org/pipermail/exherbo-dev/2012-February/001040.html

Potential mentors: Bo “zlin” Ørsted Andresen, Wulf “Philantrop” Krueger, Ciaran “ciaranm” McCreesh

Difficulty: Medium

Prerequisites: Decent knowledge of the programming language to be used

Derestricted Version Formats

Version formats are currently limited to a fairly strict set of rules. These are mostly historical things we’ve inherited from Gentoo. For example, 1.2-3, 1.2B and 1.2-alpha3 are not valid versions, but 1.2.3, 1.2b and 1.2_alpha3 are; if upstreams use the former, packages have to jump through silly hoops to work around it. There’s no particular reason to keep these limitations.
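As an illustration of the current restrictions, the Python sketch below checks version strings against a simplified approximation of the Gentoo-style rules. The regular expression is an assumption and does not cover everything the package manager accepts (for example, scm versions are left out); it is only meant to reproduce the examples above.

```python
import re

# Simplified, assumed approximation of the strict version rules.
STRICT_VERSION = re.compile(
    r"""
    \d+(\.\d+)*                     # dot-separated numeric components
    [a-z]?                          # optional single lowercase letter
    (_(alpha|beta|pre|rc|p)\d*)*    # optional underscore suffixes
    (-r\d+)?                        # optional revision
    """,
    re.VERBOSE,
)


def is_valid_version(version: str) -> bool:
    """True if the whole string matches the simplified strict rules."""
    return STRICT_VERSION.fullmatch(version) is not None


for candidate in ["1.2.3", "1.2b", "1.2_alpha3", "1.2-3", "1.2B", "1.2-alpha3"]:
    print(candidate, is_valid_version(candidate))
```

With this pattern the first three candidates are accepted and the last three are rejected, which matches the examples in the paragraph above.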

This project would consist of:

Potential mentors: Ciaran “ciaranm” McCreesh, David “dleverton” Leverton, Bo “zlin” Ørsted Andresen

Difficulty: Medium

Prerequisites: Applicants would need either a decent knowledge of C++, or a basic knowledge of C++ and a willingness to put up with grouchy programmers who like to yell “needs more unit tests!” and “not enough error checking!”.

Package categories

We’re using categories to group packages into (more or less) logical groups. While this approach is fairly easy, it falls short in many respects (e. g. does Apache httpd belong to www-servers or net-www?). There have been several suggestions about how to get rid of categories and replace them with something more versatile, e. g. tags (like Freecode).

For this project, the following things need to be worked out:

The last two promising proposals were: http://lists.exherbo.org/pipermail/exherbo-dev/2009-January/000362.html and http://lists.exherbo.org/pipermail/exherbo-dev/2012-October/001162.html.

Potential mentors: Ciaran “ciaranm” McCreesh, Wulf “Philantrop” Krueger

Difficulty: Easy

Prerequisites: Some knowledge of C++ for package manager integration

REMOTE_IDS client

Develop a client using the Paludis library to extract metadata from Exherbo packages and check for new upstream versions. The client should be able to deliver the reports using several different interfaces. At minimum, it should be able to generate a text-based report locally, mail the report and show it using a web interface.

Furthermore it should be possible to configure the client to generate reports for a subset of package repositories, package categories or specific packages.

For bonus points the client should also be able to show related data such as ChangeLogs or Release Notes when Exherbo packages contain enough metadata for that.

Potential mentors: Ciaran “ciaranm” McCreesh, David “dleverton” Leverton, Bo “zlin” Ørsted Andresen

Difficulty: Medium to high (due to the broad range of potential data sources and formats)

Prerequisites: Decent knowledge of Ruby, Python or C++ and some knowledge of website development

Exherbo image builder

Build an application that can create image files useful for installing Exherbo or testing it.

The application should be able to build a number of different formats such as:

Users should be able to specify the partition layout, which packages to install in the image, and any needed configuration for boot loaders, the kernel, etc. The application should work for both i686 and x86_64 based images and be designed in such a way that it’ll be possible to add further architecture support to it at a later point.

This project can be implemented in just about any language, but the obvious choices would probably be scripting languages like Bash, Ruby or Python. Experience with partitioning and boot loaders would be an advantage but is not a requirement.

Image builder Status

We’ve added a basic raw image building script called create-kvm-image written in bash. It works for most basic cases but needs a lot more features. It should be relatively easy to add most of these features to the script.

Features wanted

Potential mentors: Bryan “kloeri” Østergaard, Wulf “Philantrop” Krueger

Difficulty: Easy to medium (depending on the intended scope of the project)

Prerequisites: Decent knowledge of shell scripting, disk imaging tools

Quality assurance for packages/repositories

While we believe in our rigorous peer-review process which is working well, a tool that checks at least for the most basic quality issues in our exheres would be useful. We used to do this using a Paludis client (qualudis) which was based on a per-package model and had some fairly severe drawbacks.

Potential mentors: Ciaran “ciaranm” McCreesh, David “dleverton” Leverton, Bo “zlin” Ørsted Andresen, Wulf “Philantrop” Krueger

Difficulty: Medium

Prerequisites: Experience in formal quality assurance, some knowledge of C++ for package manager integration

Interactive mode for package management

While we strongly believe in up-front configuration, there are use cases for some optional interactive behaviour of the package manager. This is especially true for the ‘cave resolve’ command that basically installs and uninstalls packages. Not only can it take the package manager (PM) quite some time to figure out larger resolutions, but this might also happen twice (e. g. when the user first makes the PM only pretend to do something and then lets it actually do it).

Furthermore, recommendations (taken automatically by default) and suggestions (displayed only by default) come into this as well. Having a “staging area” in the git sense would make this much more interesting. Thus, this project is about creating an interactive mode for ‘cave resolve’, based upon ‘git add -i’.

Potential mentors: Ciaran “ciaranm” McCreesh, David “dleverton” Leverton, Bo “zlin” Ørsted Andresen

Difficulty: High

Prerequisites: Decent knowledge of C++, or a basic knowledge of C++ and a willingness to put up with grouchy programmers who like to yell “needs more unit tests!” and “not enough error checking!”.

New profiles implementation

Currently, our profiles can’t be altered from (third-party) repositories. This is especially unfortunate in the case of sub-options and the like which we have to add to (usually) arbor even though they’re needed in some third-party repository only.

Potential mentors: Ciaran “ciaranm” McCreesh, David “dleverton” Leverton, Bo “zlin” Ørsted Andresen

Difficulty: Medium

Prerequisites: Decent knowledge of C++, experience with refactoring code and writing unit tests.

Replace stage tarballs with something smarter

Traditionally, we’ve been using so-called stage tarballs, containing essential pre-compiled packages (toolchain, package manager, etc.), to have an initial, minimal environment upon which to build a complete Exherbo installation. Now that we have working and properly documented binary packages (pbins), it’s possible to replace these stages with something smarter. This one’s mostly about figuring out how to deal with things that’re currently done in pkg_* phases.

Potential mentors: Bo “zlin” Ørsted Andresen, Ciaran “ciaranm” McCreesh, Wulf “Philantrop” Krueger

Difficulty: Easy

Prerequisites: -

Building external kernel modules

Some software has its own external kernel modules (e. g. VirtualBox, VMWare, nvidia-drivers, etc.). Currently, we simply install their sources to /usr/src and tell the user to compile them. While this works, of course, this is something that could certainly be done more nicely with some support inside the distribution, especially since such modules may need to be built for several different kernel versions.

Potential mentors: Bo “zlin” Ørsted Andresen, Wulf “Philantrop” Krueger

Difficulty: Easy to medium (depending on the system to be chosen)

Prerequisites: -

CONTAINS/CONTAINED_IN

There’re some packages, like dev-perl/autodie, dev-perl/Module-Build, etc, which exist both as separate packages, and as part of core perl itself. To make things more confusing, not all versions of perl contain them. And you can’t even just do a simple “>=5.10” for some, since there’s 5.8.9 which came out later.

So, one solution that has been suggested is to add 2 new metadata keys, CONTAINS and CONTAINED_IN. They would each be space-separated lists of packages.

CONTAINS would be for dev-lang/perl, and would look something like:

dev-lang/perl/perl-5.10.1.exheres-0:

...
CONTAINS="dev-perl/Module-Build[=0.340201] ..."
...
# Not sure I like using the [=version] syntax here, but not sure of a
# better one.

Then, we would have a dev-perl/Module-Build like:

dev-perl/Module-Build/Module-Build-0.340201.exheres-0

...
CONTAINED_IN="dev-lang/perl[=5.10.1]"
...

Finally, a package that needs a particular version of Module-Build would simply dep upon it as if it were just a stand-alone package:

dev-perl/foo/foo-123.exheres-0

DEPENDENCIES="
    build:
        dev-perl/Module-Build[>=0.340201]
        ...
"

If perl 5.10.1 is already installed, and 0.340201 is the best available version of Module-Build around, then nothing will need to be pulled in. Otherwise, the newest Module-Build would be installed.
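The decision described above can be sketched roughly as follows. This is a Python illustration under stated assumptions: the `Installed` class, the function names and the naive dotted-number version comparison are all hypothetical, and none of this reflects an existing Paludis interface.

```python
from dataclasses import dataclass, field


def parse_version(text: str) -> tuple:
    """Naive comparison key: split on dots and compare numerically."""
    return tuple(int(part) for part in text.split("."))


@dataclass
class Installed:
    name: str
    version: str
    # Hypothetical parsed form of CONTAINS: package name -> contained version.
    contains: dict = field(default_factory=dict)


def provided_versions(installed: list, wanted: str):
    """Yield every installed version of 'wanted', stand-alone or contained."""
    for package in installed:
        if package.name == wanted:
            yield package.version
        if wanted in package.contains:
            yield package.contains[wanted]


def needs_install(installed: list, wanted: str, minimum: str, best_available: str) -> bool:
    """Decide whether a 'wanted[>=minimum]' dependency pulls anything in.

    Nothing is pulled in when an installed (or contained) version satisfies
    the dependency and is at least as new as the best available version.
    """
    satisfying = [
        version
        for version in provided_versions(installed, wanted)
        if parse_version(version) >= parse_version(minimum)
    ]
    if not satisfying:
        return True
    newest = max(satisfying, key=parse_version)
    return parse_version(newest) < parse_version(best_available)


perl = Installed("dev-lang/perl", "5.10.1", {"dev-perl/Module-Build": "0.340201"})
print(needs_install([perl], "dev-perl/Module-Build", "0.340201", "0.340201"))
```

Treating “a newer stand-alone version is available” as a reason to install is one reading of the paragraph above; a real implementation would have to settle that policy explicitly.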

Potential mentors: David “dleverton” Leverton

Difficulty: Easy

Prerequisites: -

Replacing RESTRICT

We currently use RESTRICT as inherited from EAPI 0. RESTRICT=test is used the most, to prevent broken tests from running, but userpriv and strip are also possible values. default_src_test tries to detect whether there is a Makefile that has a check/test target and runs that if possible. When RESTRICT=test is set, the src_test phase isn’t run. There are a few downsides to this approach:

What we want from a replacement:

A replacement called CONTROL has been proposed here but hasn’t been implemented so far. Example uses could be:

CONTROL="test [[ mode=broken privs=root ]]"
CONTROL="test [[ recommended=true expensive=true ]]"

It could also support annotations for other functions, e.g.

CONTROL="compile [[ parallel=false ]]"
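To give a feel for how such values might be consumed, here is a small Python sketch that parses the example strings above into a dictionary. The grammar it accepts (a phase name optionally followed by `[[ key=value ... ]]`) is inferred from the examples only; the function name and the returned structure are assumptions, since CONTROL has not been specified in detail.

```python
import re

# A phase name, optionally followed by a [[ ... ]] annotation block.
ENTRY = re.compile(r"(\w+)\s*(?:\[\[\s*(.*?)\s*\]\])?")


def parse_control(value: str) -> dict:
    """Parse a CONTROL-like string into {phase: {key: value}}."""
    parsed = {}
    for match in ENTRY.finditer(value):
        name, body = match.group(1), match.group(2) or ""
        annotations = {}
        for item in body.split():
            key, _, val = item.partition("=")
            annotations[key] = val
        parsed[name] = annotations
    return parsed


print(parse_control("test [[ mode=broken privs=root ]]"))
print(parse_control("compile [[ parallel=false ]]"))
```

Keeping the annotations as plain key/value pairs would let the package manager decide, for instance, to skip tests marked as broken while still offering expensive ones as an opt-in.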

Potential mentors: Bo “zlin” Ørsted Andresen, Ciaran “ciaranm” McCreesh

Difficulty: Medium

Prerequisites: Decent knowledge of C++, experience with refactoring code and writing unit tests.

Web Client

Use the Paludis API (probably the Ruby or Python bindings, rather than the C++ API) to extract information about available packages, and present it in the form of a website.

We currently have “SUMMER”, a “Statically Updated Metadata Manifestation for Exherbo Repositories”. It’s a crude Ruby script accessible from http://summer.exherbolinux.org.

The wishlist of features can be found in the README.

Potential mentors: Bo “zlin” Ørsted Andresen, Ciaran “ciaranm” McCreesh

Difficulty: Medium

Prerequisites: Familiarity with HTML, CSS and probably CGI, as well as one of Ruby, Python or C++.

Exherbo package statistics

A client/server based application collecting and displaying a variety of statistics about package use on client machines.

The client part should gather information using the Paludis library bindings and submit it to the server. The server part is responsible for receiving all the data and providing a web interface where users can view statistics from the collected data, filtering and sorting the data as needed.

Security is an important part of this project as all collected data must remain absolutely anonymous. If data is leaked that can identify either persons or machines this can harm our users as well as the Exherbo project.

The student is well advised to look at Debian Popularity Contest and Ubuntu Popularity Contest for existing implementations of this idea and inspiration.

This project is easiest to implement in Ruby or Python but can be implemented in C++ or a combination of these languages. The project doesn’t require too much prior knowledge of Paludis and should be reasonably easy.

Potential mentors: Bo “zlin” Ørsted Andresen, Ciaran “ciaranm” McCreesh

Difficulty: Easy

Prerequisites: Familiarity with Ruby, Python or C++. Network protocols, client/server communication

Improve sandboxing

Currently we have a tool called Sydbox to do basic sandboxing, which aims to detect misbehaving builds. Being a very simple tool, Sydbox doesn’t allow programmatic access to its internals. For this reason, a new project called PinkTrace has been started.

PinkTrace is a ptrace() wrapper library which aims to make writing tracing programs easy. ptrace() is very OS and architecture dependent. Writing portable tracing programs requires a tremendous amount of work.

Currently PinkTrace has an unstable API and ABI. There are many parts that need improving. The main plan is to write a callback-driven higher-level library called pinktrace-easy on top of PinkTrace. Some preliminary work has started in the easy branch.

Potential mentors: Ali “alip” Polatel

Difficulty: Medium

Prerequisites: Familiarity with C and the C99 standard

Multi-library / multi-build improvements

In 2011, we implemented native “multilib” support for the i686 (aka x86) and x86_64 (aka amd64) architectures.

While this works nicely, there’s lots of room for improvement, both in the currently supported arches and in adding support for additional arches. Some possible changes are discussed here:

http://lists.exherbo.org/pipermail/exherbo-dev/2011-December/001012.html

Potential mentors: David “dleverton” Leverton, Bo “zlin” Ørsted Andresen

Difficulty: Medium to high

Prerequisites: Decent knowledge of the GNU/Linux (development) toolchain (gcc, glibc, binutils, etc.), some C++ knowledge would be beneficial for the package manager side of this project.

Better handling of library ABI breakages

Currently, when a shared library breaks binary compatibility by changing its SONAME, the user has to run cave fix-linkage afterwards to find and rebuild the affected packages. It would be better if this could be done automatically when the library is upgraded; even better would be to allow the old version of the library to stay installed until the dependent packages are upgraded, but without colliding with the new one over headers, static libraries, etc. There’s some discussion of the first part at:

http://lists.exherbo.org/pipermail/exherbo-dev/2011-March/000889.html

For the second, an initial attempt might be to have a different SLOT for each SONAME, and when a new SLOT is released, add a new revision to the old one that only includes the library itself. More sophisticated would be to avoid the need to create the new revision explicitly and make the package manager somehow know that old versions only need the library. The yet-unimplemented “parts” concept, which involves classifying the installed files as libraries, headers, data, etc, would probably be useful here.

Potential mentors: Ciaran “ciaranm” McCreesh, David “dleverton” Leverton

Difficulty: Medium to high

Prerequisites: Decent knowledge of C++.

Other smaller project ideas, not necessarily for Google Summer of Code


Copyright 2009-2012 Bryan Østergaard, Ciaran McCreesh, Fernando J. Pereda and Wulf C. Krueger