Setup Your Own LXR

On February 20, 2014, in How To, Linux, by erik

By: Alan Deutscher

An LXR system is an extremely useful way to index and browse source code. The setup is a bit tricky, however, so here is a guide!

I have been experimenting with the LXRng code indexer, a utility used to index C code. Originally made for the Linux kernel, it can be reused with pretty much any C project.

Environment

CentOS 6.5

Files/Packages

Browse through the repository to find the right path if you’re using a different CentOS version/architecture combination.

Repositories (CentOS 6, 64-bit)

Install the EPEL and RPMForge repositories:

rpm -ivh http://fedora-epel.mirror.iweb.com/6/x86_64/epel-release-6-8.noarch.rpm
rpm -ivh http://pkgs.repoforge.org/rpmforge-release/rpmforge-release-0.5.3-1.el6.rf.x86_64.rpm

Install the Xapian repositories:

rpm -ivh http://rpm.eprints.org/rpm-eprints-org-key-1-1.noarch.rpm
rpm -ivh http://rpm.eprints.org/xapian/6/noarch/rpm-eprints-org-xapian-6-1.noarch.rpm

Packages

Install the packages for the environment:

yum install git inkscape postgresql postgresql-server ctags perl-CGI-Ajax \
    perl-CGI-Simple perl-DBD-Pg perl-HTML-Parser perl-Template-Toolkit \
    perl-Term-ProgressBar perl-Term-Size perl-TermReadKey perl-Devel-Size \
    gcc gcc-c++ perl-ExtUtils-MakeMaker perl-CPAN perl-YAML libpng libpng-devel \
    texinfo perl-Digest-SHA1 wget httpd make icoutils xapian-core gifsicle \
    xapian-bindings-{php,python,tcl8,perl,ruby} ncftp xapian-core-devel

Note: Double-check the Xapian repositories one last time. Without them, you will not be able to install any version of the following packages:

xapian-bindings-php
xapian-bindings-tcl8
xapian-bindings-perl

LXRng will try to run ctags as ctags-exuberant (this is the name of the command in Debian-based distributions), so we will make a symbolic link for it.

ln -s /usr/bin/ctags /usr/bin/ctags-exuberant

Perl Modules

Install additional Perl modules (many of these seem to be redundant after the above yum installation):

perl -MCPAN -e 'install Search::Xapian'
perl -MCPAN -e 'install Class::Accessor'
perl -MCPAN -e 'install Class::MethodMaker'
perl -MCPAN -e 'install Digest::SHA'
perl -MCPAN -e 'install HTML::Parser'
perl -MCPAN -e 'install IO::Stringy'
perl -MCPAN -e 'install PerlIO::gzip'
perl -MCPAN -e 'install Template'
perl -MCPAN -e 'install HTML::Entities'
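
Since several of these modules may already be present after the yum step, it can save time to check which ones Perl can actually load before invoking CPAN. The following is a minimal sketch of such a check; the function name missing_perl_modules is made up for this example and is not part of LXRng or CPAN.

```shell
#!/bin/sh
# Print the name of every module (given as arguments) that perl cannot load.
# missing_perl_modules is a hypothetical helper name, not part of LXRng.
missing_perl_modules() {
    for mod in "$@"; do
        # "perl -MModule -e 1" exits non-zero when the module fails to load.
        if ! perl -M"$mod" -e 1 >/dev/null 2>&1; then
            echo "$mod"
        fi
    done
}

# The modules from the list above; only the ones that fail to load are printed.
missing_perl_modules Search::Xapian Class::Accessor Class::MethodMaker \
    Digest::SHA HTML::Parser IO::Stringy PerlIO::gzip Template HTML::Entities
```

Anything it prints is a candidate for the CPAN commands; an empty result suggests the yum packages already covered everything.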

Re-Installing Perl Modules

If for some reason you feel your Perl module installation wasn’t proper, you can uninstall it by using the following script to track down all of the files that were installed by the module:

#!/usr/bin/perl
 
use 5.010;
use ExtUtils::Installed qw();
use ExtUtils::Packlist qw();
 
die "Usage: $0 Module::Name Module::Name\n" unless @ARGV;
 
for my $mod (@ARGV) {
    my $inst = ExtUtils::Installed->new;
 
    foreach my $item (sort($inst->files($mod))) {
        say "removing $item";
        unlink $item or warn "could not remove $item: $!\n";
    }
 
    my $packfile = $inst->packlist($mod)->packlist_file;
    print "removing $packfile\n";
    unlink $packfile or warn "could not remove $packfile: $!\n";
}

LXR Files and Database Configuration

Clone the Git repository for LXR:

git clone git://lxr.linux.no/git/lxrng.git /opt/lxrng

Navigate in and copy the existing example config files.

cd /opt/lxrng
cp lxrng.conf-dist lxrng.conf
cp apache2-site.conf-dist-cgi apache2-site.conf

Make sure that the Postgres service is initialized and running (The stage with initdb is only necessary on CentOS 6):

chkconfig postgresql on
service postgresql initdb
service postgresql start

Become the postgres user to manipulate our Postgres database:

su - postgres

You can create databases for your indexing projects using the “createdb” command:

createdb lxrng

Create a database role for the Apache service to use, named after the system user that Apache runs as (the database itself already exists from the previous step):

createuser apache
> Shall the new role be a superuser? (y/n) n
> Shall the new role be allowed to create databases? (y/n) n
> Shall the new role be allowed to create more new roles? (y/n) n

Create another user.

createuser lxr
> Shall the new role be a superuser? (y/n) n
> Shall the new role be allowed to create databases? (y/n) n
> Shall the new role be allowed to create more new roles? (y/n) n

And another still!

createuser lxradmin
> Shall the new role be a superuser? (y/n) n
> Shall the new role be allowed to create databases? (y/n) y
> Shall the new role be allowed to create more new roles? (y/n) n
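
The prompts above can also be answered on the command line, which helps when scripting the setup. Here is a minimal sketch, assuming the long options of PostgreSQL's createuser (--no-superuser, --no-createdb, --no-createrole, --createdb). The function name create_lxr_roles and the RUN switch are inventions for this example. By default the commands are only printed, not executed.

```shell
#!/bin/sh
# create_lxr_roles is a hypothetical helper. Run it as the postgres user.
# By default each command is only printed; call it with RUN set to an
# empty string (RUN= create_lxr_roles) to execute the commands for real.
create_lxr_roles() {
    run=${RUN-echo}
    # apache and lxr: no superuser, no database creation, no role creation.
    $run createuser --no-superuser --no-createdb --no-createrole apache
    $run createuser --no-superuser --no-createdb --no-createrole lxr
    # lxradmin: the same, except that it may create databases.
    $run createuser --no-superuser --createdb --no-createrole lxradmin
}

create_lxr_roles
```

The three lines mirror the answers given interactively above: only lxradmin is allowed to create databases.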

Become root again, create the data directories, and alter a few permissions:

mkdir /var/lib/lxrng/text-db -p
mkdir /var/lib/lxrng/cache
chmod a+rw /var/lib/lxrng
chmod a+rw /var/lib/lxrng/cache/
chmod a+rw /var/lib/lxrng/text-db/

Also create a directory for repositories wherever you like. For an easy-to-type location, I opted for “/srv/lxrng/repos/”.

While this is fine for a closed-off box, you may wish to consider a more careful approach for a box that people have access to. Note that both the postgres and apache users need to be able to read all of these files.

A user requested the ability to also index C++ code, so I made a modification to the LXRng files. From the root of your LXRng setup, open “lib/LXRng/Lang/C.pm” for editing. Find the pathexp method, which contains the regex for C file extensions. In its original state, it should look like this:

sub pathexp {
    return qr/\.[ch]$/;
}

Modify the content so that it looks like this:

sub pathexp {
    return qr/\.[ch]$|\.cpp$/;
}

If you have any C++ projects that you have already indexed, you will need to re-index them.
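
One thing worth checking in a pattern like this is whether the dots are escaped, because an unescaped “.” matches any character rather than a literal dot. The sketch below uses grep -E as a stand-in for Perl's regex engine (the two behave the same for patterns this simple) and runs both variants against a handful of made-up file names.

```shell
#!/bin/sh
# Sample file names, made up for this illustration.
names='main.c
util.h
parser.cpp
notes.arch
README'

echo "unescaped dot:"
# "." matches any character, so "notes.arch" (ending in "ch") slips through.
printf '%s\n' "$names" | grep -E '.[ch]$|.cpp$'

echo "escaped dot:"
# Only names that really end in .c, .h or .cpp match.
printf '%s\n' "$names" | grep -E '\.[ch]$|\.cpp$'
```

The first run lists four names, including notes.arch; the second lists only main.c, util.h and parser.cpp.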

Apache Configuration


cp /opt/lxrng/apache2-site.conf-dist-cgi /opt/lxrng/apache2-site.conf
sed -i 's:@@LXRURL@@::' /opt/lxrng/apache2-site.conf
sed -i 's:@@LXRROOT@@:/opt/lxrng:' /opt/lxrng/apache2-site.conf
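
To see what these substitutions do before touching the real file, the same expressions can be tried on a throwaway copy. In the sketch below, the two template lines are made up in the style of the placeholders and are not copied from the actual apache2-site.conf-dist-cgi.

```shell
#!/bin/sh
# Work on a temporary file so nothing under /opt/lxrng is modified.
tmp=$(mktemp)
# Two invented lines that use the same placeholders as the template.
cat > "$tmp" <<'EOF'
Alias @@LXRURL@@/static "@@LXRROOT@@/webroot/static"
ScriptAlias @@LXRURL@@/ "@@LXRROOT@@/webroot/lxr.cgi/"
EOF

# The same expressions as above, printed instead of written back with -i.
# ":" is the delimiter, so the "/" characters in the path need no escaping.
result=$(sed -e 's:@@LXRURL@@::' -e 's:@@LXRROOT@@:/opt/lxrng:' "$tmp")
printf '%s\n' "$result"

# List any line that still contains a placeholder; no output means none left.
printf '%s\n' "$result" | grep '@@' || true
rm -f "$tmp"
```

Because the expressions have no g flag, sed replaces only the first occurrence on each line. A line that repeats the same placeholder would need the flag added.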

In “/opt/lxrng/apache2-site.conf”, make the following manual change:

# For LXRng installed directly in the web site root, use
ScriptAlias / "/opt/lxrng/webroot/lxr.cgi/"
# otherwise use (no trailing slash):
# ScriptAlias / "/opt/lxrng/webroot/lxr.cgi"

If your site is jumping right to a 404 error page, odds are you mixed these two up!

Create a symbolic link to Apache’s “conf.d/” folder:

ln -s /opt/lxrng/apache2-site.conf /etc/httpd/conf.d/

Git Configuration

The default configuration for LXRng (configured for the Linux kernel) looks like this:

# Configuration file
#
#
 
use LXRng::Index::PgBatch;
use LXRng::Repo::Git;
use LXRng::Search::Xapian;
 
my $gitrepo = LXRng::Repo::Git
    ->new('/var/lib/lxrng/repos/linux-2.6/.git',
          release_re => qr/^v[^-]*$/,
          author_timestamp => 0);
 
my $index   = LXRng::Index::PgBatch->new(db_spec => 'dbname=lxrng;port=5432',
                                         db_user => "", db_pass => "",
                                         # table_prefix => 'lxr'
                                         );
my $search  = LXRng::Search::Xapian->new('/var/lib/lxrng/text-db/linux-2.6');
 
return {
    'linux' => {
        'repository'  => $gitrepo,
        'index'       => $index,
        'search'      => $search,
 
        'base_url'    => 'http://localhost/lxr',
        # Must be writable by httpd user:
        'cache'       => '/var/lib/lxrng/cache',
 
        'fs_charset'  => 'iso-8859-1',
        # Tried successively
        'content_charset' => ['utf-8', 'iso-8859-1'],
 
        'languages'   => ['C', 'GnuAsm', 'Kconfig'],
        'ver_list'    => [$gitrepo->allversions],
 
        'ver_default' => 'v2.6.20.3',
 
        'include_maps' =>
            [
             [qr|^arch/(.*?)/|, qr|^asm/(.*)|,
              sub { "include/asm-$_[0]/$_[1]" }],
             [qr|^include/asm-(.*?)/|, qr|^asm/(.*)|,
              sub { "include/asm-$_[0]/$_[1]" }],
             [qr|^|, qr|^asm/(.*)|,
              sub { map { "include/asm-$_/$_[0]" }
                    qw(i386 alpha arm ia64 m68k mips mips64),
                    qw(ppc s390 sh sparc sparc64 x86_64) }],
             [qr|^|, qr|(.*)|,
              sub { "include/$_[0]" }],
             ],
    },
};

Plain Repositories

My go-to testing repository was OpenVPN. Most of the examples use the Linux kernel, for obvious reasons, but that code base is enormous, and it’s much easier to iterate while troubleshooting with a smaller project. OpenVPN keeps its source in Git, but it can just as easily be used as an example for Plain storage.

Versions

As they do not have metadata like Git, plain repositories simply separate different versions into sub-directories of the repository root.

For example, let’s say that a user wanted to index the plain repository housed at “/var/lib/lxrng/repos/openvpn-plain”.

my $openvpnRepo = LXRng::Repo::Plain->new('/var/lib/lxrng/repos/openvpn-plain');

This directory has two folders, “2.2/” and “2.3/”. When the project is indexed, these directory names will be read as our versions.

My end use-case was a project stored in an SVN repository, so I would simply do a check-out into each of the folders from the appropriate revision.

svn co http://code.network.local/trunk/project/@123 /var/lib/lxrng/repos/project/v1.0/
svn co http://code.network.local/trunk/project/@321 /var/lib/lxrng/repos/project/v2.0/

Configuration

In “lxrng.conf”, setting up a repository:

# -*- mode: perl -*-
# Configuration file
#
#
 
use LXRng::Index::PgBatch;
use LXRng::Repo::Git;
use LXRng::Repo::Plain;
use LXRng::Search::Xapian;
 
my $openvpnIndex = LXRng::Index::PgBatch->new(db_spec => 'dbname=openvpn;port=5432',
                     db_user => "", db_pass => "",
                     );
 
my $openvpnRepo = LXRng::Repo::Plain->new('/var/lib/lxrng/repos/openvpn-plain');
 
my $openvpnSearch = LXRng::Search::Xapian->new('/var/lib/lxrng/text-db/openvpn-plain');
 
return {
  'openvpn' => {
    'repository'=>$openvpnRepo,
    'index'=>$openvpnIndex,
    'search'=>$openvpnSearch,
    'base_url'=>'http://192.168.1.5/',
    'fs_charset'=>'iso-8859-1',
    'content_charset' => ['utf-8','iso-8859-1'],
    'languages' => ['C', 'GnuAsm', 'Kconfig'],
    'ver_list' => [$openvpnRepo->allversions],
  },
};

Checking out the repository.

git clone https://github.com/OpenVPN/openvpn.git /var/lib/lxrng/text-db/openvpn

Become the postgres user again.

su - postgres

Create the database for the repository:

psql
 
postgres=# CREATE DATABASE openvpn WITH OWNER=apache;
CREATE DATABASE

Initialize the database:

/opt/lxrng/lxr-db-admin openvpn --init

Troubleshooting

While some of these errors can be worked around, they are most likely caused by a configuration error. Items without workarounds are ones that disappeared after I scrapped my configuration and started from scratch.

Can’t use string as an ARRAY ref

When I ran:

lxr-db-admin openvpn --init

I was given the following errors:

Use of uninitialized value in array element at /usr/local/lib64/perl5/Search/Xapian.pm line 115, <$cfgfile> line 56.
Use of uninitialized value in array element at /usr/local/lib64/perl5/Search/Xapian.pm line 120, <$cfgfile> line 56.
Use of uninitialized value in array element at /usr/local/lib64/perl5/Search/Xapian.pm line 125, <$cfgfile> line 56.
Use of uninitialized value in array element at /usr/local/lib64/perl5/Search/Xapian.pm line 130, <$cfgfile> line 56.
Can't use string ("/opt/lxraz") as an ARRAY ref while "strict refs" in use at configuration file line 44, <$cfgfile> line 56.

There were many more “uninitialized value” warnings; I have included only one instance of each.

The error on the final line was caused by this snippet of code in the “lib/LXRng/Context.pm” file:

my @config = eval("use strict; use warnings;\n".
                  "#line 1 \"configuration file\"\n".
                  join("", <$cfgfile>));
...

We do not want the “refs” part of strict here. I solved this by disabling strict “refs” and making sure that strict was not re-enabled when the config file was loaded:

...
no strict "refs";
my @config = eval("use warnings;\n".
                  "#line 1 \"configuration file\"\n".
                  join("", <$cfgfile>));
...

Not a HASH reference

Immediately after solving the above ARRAY error, I encountered this:

Uncaught exception from user code:
        Not a HASH reference at /opt/lxrng/lib/LXRng/Context.pm line 100.

The solution to this error ended up being a hair-tearingly simple issue. I had defined my repository as an array instead of a hash.

...
'openvpn' => [
    'repository'=>$openvpnRepo,
    'index'=>$openvpnIndex,
    'search'=>$openvpnSearch,
    'base_url'=>'http://192.168.1.5/',
    'fs_charset'=>'iso-8859-1',
    'content_charset' => ['utf-8','iso-8859-1'],
    'languages' => ['C', 'GnuAsm', 'Kconfig', 'Generic' ],
    'ver_list' => [$openvpnRepo->allversions],
  ],
...

By using the square brackets, I declared an array, as opposed to the hash that I wanted. Perl will not complain about the odd-looking use of => between every other element, because => is essentially a comma; it simply stores all of the items, in order, as an array. The corrected definition uses curly braces:

...
'openvpn' => {
    'repository'=>$openvpnRepo,
    'index'=>$openvpnIndex,
    'search'=>$openvpnSearch,
    'base_url'=>'http://192.168.1.5/',
    'fs_charset'=>'iso-8859-1',
    'content_charset' => ['utf-8','iso-8859-1'],
    'languages' => ['C', 'GnuAsm', 'Kconfig'],
    'ver_list' => [$openvpnRepo->allversions],
  },
...
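
The difference between the two bracket styles can be seen directly with Perl's ref() function. The one-liners below are only an illustration; the keys and values are placeholders, not a working LXRng configuration.

```shell
#!/bin/sh
# Square brackets build an array reference: the four items are stored in order.
perl -e 'my $cfg = [ "repository" => "repo", "index" => "idx" ];
         print ref($cfg), " with ", scalar(@$cfg), " elements\n";'

# Curly braces build a hash reference: the same items become two key/value pairs.
perl -e 'my $cfg = { "repository" => "repo", "index" => "idx" };
         print ref($cfg), " with ", scalar(keys %$cfg), " keys\n";'
```

The first command reports an ARRAY with 4 elements and the second a HASH with 2 keys. That difference is why code expecting a hash fails with “Not a HASH reference” when handed the first form.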

Refreshingly, this fix did not bring back any of the other errors I’d solved up to this point.

Can’t Call Method

Received the following error:

Can't call method "contents" on an undefined value at /opt/lxrng/lib/LXRng/Repo/Plain/Iterator.pm line 28.

This was due to a misunderstanding about how plain folders were indexed.

Plain folders do “versions” by having different folders in the repository path named for the versions. Without the sub-directory to act as a version, this error will appear.
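
A quick way to catch this before indexing is to list what the repository root contains. The sketch below assumes that every immediate sub-directory of the root counts as a version, which matches the behaviour described here but has not been checked against the LXRng source. The helper name list_versions is hypothetical.

```shell
#!/bin/sh
# list_versions is a hypothetical helper: it prints each immediate
# sub-directory of a plain repository root, one per line, and returns
# a non-zero status if there are none.
list_versions() {
    root=$1
    found=1
    for dir in "$root"/*/; do
        # If the glob matched nothing, it stays literal and is skipped here.
        [ -d "$dir" ] || continue
        basename "$dir"
        found=0
    done
    return "$found"
}

# Demonstration on a throwaway directory with two version folders.
demo=$(mktemp -d)
mkdir "$demo/2.2" "$demo/2.3"
list_versions "$demo"
rm -rf "$demo"
```

Running list_versions against the real repository path and getting no output would point to the missing version sub-directory described above.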

Cannot Index – Cannot Locate /Search/Xapian/Document/new1.al

Received the following error:

Can't locate auto/Search/Xapian/Document/new1.al in @INC (@INC contains: /opt/lxrng/lib /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at /usr/local/lib64/perl5/Search/Xapian/Document.pm line 31

There is a “Search/Xapian/Document.pm” in my Perl path, but no matching directory.

Solution still pending. This currently seems to be a configuration error, as the problem does not come up with a barely-modified version of the config file.

Cannot Index – Cannot Locate auto/Search/Xapian/DB_CREATE_O.al

When working off of a barely-modified version of the config file, received an error. With diagnostics enabled, received the following stack trace when trying to index a Git project:

Use of inherited AUTOLOAD for non-method Search::Xapian::DB_CREATE_OR_OPEN() is deprecated at /opt/lxrng/lib/LXRng/Search/Xapian.pm line 52.
Indexing: (nothing to do)
Can't locate auto/Search/Xapian/DB_CREATE_O.al in @INC (@INC contains:
        /opt/lxrng/lib /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at /opt/lxrng/lib/LXRng/Search/Xapian.pm line 52 (#5)
Uncaught exception from user code:
        Can't locate auto/Search/Xapian/DB_CREATE_O.al in @INC (@INC contains: /opt/lxrng/lib /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at /opt/lxrng/lib/LXRng/Search/Xapian.pm line 52
 at /usr/share/perl5/AutoLoader.pm line 47
        AutoLoader::AUTOLOAD() called at /opt/lxrng/lib/LXRng/Search/Xapian.pm line 52
        LXRng::Search::Xapian::wrdb('LXRng::Search::Xapian=HASH(0x26ae9b8)') called at /opt/lxrng/lib/LXRng/Search/Xapian.pm line 145
        LXRng::Search::Xapian::flush('LXRng::Search::Xapian=HASH(0x26ae9b8)') called at ./lxr-genxref line 441

The code in “lib/LXRng/Search/Xapian.pm” on lines 52/53 is:

return $$self{'wrdb'} = Search::Xapian::WritableDatabase
    ->new($$self{'db_root'}, Search::Xapian::DB_CREATE_OR_OPEN);


Command Line dictionary at your fingertips

On January 14, 2014, in How To, Linux, by erik

I do a lot of my work via the command line, so naturally I thought it would be useful to have a dictionary at my fingertips as well, either for a word’s meaning or for its correct spelling. There is a cool program I ran into called ‘sdcv’, the console version of StarDict. It is quite easy to install on a Debian-based system:

Install the dictionary

Run the following command in the terminal:

sudo apt-get install sdcv

Download dictionary files

Download the dictionary files according to your requirements from the following sources.

http://abloz.com/huzheng/stardict-dic/dict.org/

Install downloaded dictionaries

Make the directory where sdcv looks for the dictionary:

sudo mkdir -p /usr/share/stardict/dic/

The next command depends on whether the downloaded file is a .gz file or a .bz2 file.

If it is a .bz2 file:

sudo tar -xvjf downloaded.tar.bz2 -C /usr/share/stardict/dic

If it is a .gz file:

sudo tar -xvzf downloaded.tar.gz -C /usr/share/stardict/dic
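
If you download several dictionaries, the choice between the two commands can be folded into a small shell function that looks at the file extension. This is only a sketch; the function name extract_dict is made up for this example.

```shell
#!/bin/sh
# extract_dict is a hypothetical helper: it picks the tar decompression
# flag from the archive's extension and unpacks into the given directory.
extract_dict() {
    archive=$1
    dest=$2
    case "$archive" in
        *.tar.bz2) tar -xjf "$archive" -C "$dest" ;;
        *.tar.gz)  tar -xzf "$archive" -C "$dest" ;;
        *) echo "unsupported archive type: $archive" >&2; return 1 ;;
    esac
}

# Example call (from a root shell, since the target directory is system-wide):
# extract_dict downloaded.tar.bz2 /usr/share/stardict/dic
```

A shell function cannot be passed to sudo directly, so either run it from a root shell or put “sudo” in front of the two tar calls.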

To search for a word use:

sdcv word

Create an Alias

I found that typing ‘sdcv word’ became very tedious, so I ended up setting up an alias for sdcv called word. This means I can now just type ‘word hello’, and the program goes off and finds the word for me. Who needs a browser anyways!
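
For reference, an alias like this is a one-liner. The sketch below assumes bash and that the line goes into ~/.bashrc; other shells use a different startup file.

```shell
# Add this line to ~/.bashrc (assuming bash), then open a new terminal
# or run "source ~/.bashrc" so that the alias takes effect.
alias word='sdcv'
```

After that, ‘word hello’ runs ‘sdcv hello’.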