Counting Debian source packages

This is a follow-up to Kushal’s post about counting the total number of Debian packages, in which he concluded that sid currently has more than 30,000 binary packages (free, contrib & non-free).

IMHO it is more relevant to count source packages. I couldn’t find any existing way of doing it, so I have written a short shell script.

Script updated thanks to Thomas’ advice – now checking source packages directly from the mirror’s Sources.gz file

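#!/bin/sh
# Count the source packages of each Debian distribution given as an argument.

if [ $# -lt 1 ]; then
  # Usage errors go to stderr
  echo "Please add at least one distribution as argument" >&2
  echo "Exiting" >&2
  exit 1
fi

# "$@" keeps each argument intact, unlike an unquoted $*
for arg in "$@"
do

  echo "Number of source packages in $arg: "
  for dist in main contrib non-free; do
    # printf instead of "echo -n", which is not portable across sh implementations
    printf '  %s: ' "$dist"
    # Stream the section's source index and count its "Package:" stanzas
    wget -q -O - "ftp://ftp.debian.org/debian/dists/$arg/$dist/source/Sources.gz" | zgrep -c '^Package: '
  done

done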
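
The script above counts every "Package:" stanza, which is only accurate when the index lists each source package once. As a minimal sketch, assuming an uncompressed Sources index arrives on standard input, the counting step can be made robust against repeated names. The function name count_sources and the sample stanzas are made up for illustration and do not come from a real mirror.

```shell
#!/bin/sh
# Count distinct source package names in an uncompressed Sources index
# read from standard input. count_sources is a hypothetical helper name.
count_sources() {
  # Keep only the stanza headers, drop repeated names, count what is left.
  # tr strips the padding that some wc implementations add.
  grep '^Package: ' | sort -u | wc -l | tr -d ' '
}

# Made-up sample index: "foo" appears twice with different versions.
sample_count=$(printf '%s\n' \
  'Package: foo' 'Version: 1.0' '' \
  'Package: bar' 'Version: 2.0' '' \
  'Package: foo' 'Version: 1.1' | count_sources)

echo "$sample_count"   # prints 2
```

With a downloaded index this could be used as `gunzip -c Sources.gz | count_sources`, where Sources.gz stands for whichever file was fetched from the mirror.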

The results as of today:

Number of source packages in etch:
  main: 10221
  contrib: 126
  non-free: 211
Number of source packages in lenny:
  main: 12176
  contrib: 180
  non-free: 241
Number of source packages in sid:
  main: 13032
  contrib: 158
  non-free: 275

This means we are quite far from the 18,733+ available packages proudly announced on the Debian homepage.

Update: I had misread the figure advertised on the Debian homepage, which is a count of the binary packages available (and the actual count is currently much higher than what is stated there).
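
To make the difference between the two counts concrete, here is a minimal sketch that derives both from the stanzas of a binary Packages index. It relies on the Debian control file convention that a binary stanza carries a "Source:" field only when the source name differs from the binary name, optionally followed by a version in parentheses. The function names and the sample stanzas are hypothetical. Note also that a Packages file covers a single architecture, so this only sees sources that build something for that architecture.

```shell
#!/bin/sh
# Print one source package name per stanza of a binary Packages index
# read from standard input. source_names is a hypothetical helper name.
source_names() {
  awk '
    function flush() {
      if (pkg != "") {
        # No Source field means the source is named like the binary
        if (src == "") src = pkg
        print src
      }
      pkg = ""
      src = ""
    }
    /^Package: / { pkg = $2 }
    /^Source: /  { src = $2 }   # $2 ignores a trailing "(version)"
    /^$/         { flush() }    # a blank line ends the stanza
    END          { flush() }    # the last stanza may lack a blank line
  '
}

# Made-up sample: two binaries built from "foo", one standalone "bar".
sample() {
  printf '%s\n' \
    'Package: libfoo1' 'Source: foo' '' \
    'Package: libfoo-dev' 'Source: foo (1.0-2)' '' \
    'Package: bar'
}

binary_count=$(sample | grep -c '^Package: ')
source_count=$(sample | source_names | sort -u | wc -l | tr -d ' ')
echo "binary: $binary_count, source: $source_count"   # binary: 3, source: 2
```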


12 Comments »

 
  • Anonymous says:

    You want grep-dctrl.

  • Thomas says:

    a) grep -c Package: /mirror/debian/dists/sid/main/source/Sources
    b) Packages in Debian == Packages in main, so 18733 binaries then compares with ~23800 now
    c) your way of getting sources from Packages: is completely WTF to the point that I wonder why you are a Debian Maintainer yet when you still have a lot to learn about where to get information about Debian packages
    (echo ; grep '^\(Source\|Package\|$\)' /mirror/debian/dists/sid/main/binary-amd64/Packages ) | grep -A 1 ^$ | awk '/:/ { print $2 }' | sort | uniq | wc -l

    These work with your /var/lib/apt/lists, too.

  • Julien says:

    Sorry, I cannot understand why you are so aggressive.
    I was just asking…
    If you doubt my skills as a DM, please report it to the appropriate people; I won’t take it badly and will respect their decision!

    You are 100% right about a); I am still wondering why I didn’t think of it before. I first downloaded the results of allpackages to check the results of the previous post, which might explain this (not an excuse though).

    For b), rereading the homepage did indeed lead me to think it refers to binary packages. It should however be updated to reflect the current state of packages.

    c) is limited to the current architecture, which I want to avoid.

  • Thomas says:

    Sorry for being too aggressive,

    Here is what bugs me:

    a) your original post implied that “packages in Debian” on Debian’s main page could include contrib and non-free, even though you stated that you understood the Social Contract when applying for DM. That isn’t even a problem, because everyone gets confused about something at times (I can probably tell up from down only about 50% of the time), but

    b) you’re putting your conclusion from this bad premise up on Planet Debian where everyone else will learn from it. That bugs me. That bugs me more in the light of things like Lucas’ thoughts about membership. I don’t have any reason to doubt your package-maintaining skills (and hey, there are way too many people whose package-maintaining skills I do doubt), but I thought your blog entry was problematic in the making-assertions-in-places-where-people-will-take-them-as-facts-before-understanding-department. I mean, clearly, either i) your script was wrong ii) the number on Debian’s web page was bogus iii) something else was going on. I would hate for headlines “Debian lies about the number of packages it contains” to be generated from your blog post.

    c) The suboptimal script, hey, OK. You put it on your blog, not in a package, I should just leave you alone for that.

    But you’re entirely right about the useless aggression part. Sorry.

    Mind you, you still compare the Debian page’s binary count with your source count, maybe you could clarify your last sentence to reflect that your opinion is that the page should count something different instead of claiming the number is off the mark.

  • Thomas says:

    btw:

    zgrep '^Package: ' | sed 's/Package: //' | sort -u | wc -l

    so at least the sed is not needed and I should think that

    zgrep -c '^Package: '

    should do the trick entirely for repositories only featuring one version at a time (this should include dak ones at the moment).

  • Julien says:

    Reply to comment #4:
    a) No, when I wrote “especially as these figures do include non-free and contrib packages which are not part of Debian” I wanted to say that the results of my script included non-free and contrib, not the figure stated on the Debian homepage.

    b) That was clearly not my intention. I just wanted to check the figure announced. From our discussion, I understand that the figure advertised is far below the truth, and I think it would be a good thing for the project to update it (especially as the current figure is much larger!)

    c) I didn’t want to change this part of the post, as I explain my error in my first answer to you (I will however update the post to reflect that)

    Reply to comment #5:
    Thank you, script updated again

  • Kirya [.net] says:

    Counting Debian source packages #2…

    Thanks to Joerg for pointing out pkg-nums.
    I was pretty sure such a tool already existed but could not find it.
    At least, my script (with Thomas’ amendments) allows distinctions between sections (free, contrib, non-free). And thanks to pkg-nums, I…

  • asdf says:

    Just a stupid detail: you’re missing “!” in your shebang and “bash” is unnecessary, just use “sh”.

    Anyway, an interesting fact I learned is how low the package count in Ubuntu actually is (just over 3000) compared to Debian. Ubuntu is for i386/amd64 while Debian Lenny is for i386/alpha/sparc/powerpc/arm/mips/mipsel/ia64/hppa/s390/amd64/armel. Granted, not all Debian packages run on all architectures but still I feel awe about Debian.

    I used the Ubuntu URL in your script.

    ftp://ftp.ubuntu.com/ubuntu/dists/$arg/$dist/source/Sources.gz

    • Julien says:

      Thanks, post updated!

    • Julien says:

      Here is a script for Ubuntu, using your URL. Arguments should be “jaunty”, “hardy” etc.

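      #!/bin/sh
      # Count the source packages in main for each Ubuntu release given as an argument.

      if [ $# -lt 1 ]; then
        echo "Please add at least one distribution as argument" >&2
        echo "Exiting" >&2
        exit 1
      fi

      # "$@" keeps each argument intact, unlike an unquoted $*
      for arg in "$@"
      do

        # printf instead of "echo -n", which is not portable across sh implementations
        printf 'Number of source packages in %s: ' "$arg"
        wget -q -O - "ftp://ftp.ubuntu.com/ubuntu/dists/$arg/main/source/Sources.gz" | zgrep -c '^Package: '

      done

      Number of source packages in karmic: 3059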
      Number of source packages in jaunty: 3039
      Number of source packages in intrepid: 3199
      Number of source packages in hardy: 3114
  • asdf says:

    I wasn’t able to send a comment here from Opera 9.64. It worked from FF. You should fix that, thanks.

 
