Friday, April 1, 2016

pingest.pl Gets a New Option

If you use pingest.pl from MVLC's Evergreen utilities repository, you might be interested to know that it gained a new option this week:

--pipe
         Read record IDs to reingest from standard input.
         This option conflicts with --start-id and/or --end-id.


This new option allows you to run a custom query to feed record ids to pingest.pl. For instance, assuming you have a text file called query.sql containing a query that returns bibliographic record ids, you could use a command line like the following to ingest the records corresponding to those ids:

    psql -q -t -f query.sql | pingest.pl --pipe

In the absence of the --pipe option, pingest.pl continues to use its internal query to determine what records to ingest.
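
If you want to be sure that only record ids reach pingest.pl, a small filter can sit between psql and the pipe. The following is a minimal sketch, not part of pingest.pl: the function name only_ids is hypothetical, and whether pingest.pl actually needs this cleanup is an assumption, since it may well tolerate the raw psql output.

```shell
# Hypothetical helper, not part of pingest.pl: trim surrounding
# whitespace from each input line and pass through only the lines
# that consist of a single integer.
only_ids() {
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' | grep -E '^[0-9]+$'
}
```

With that function defined in your shell, the pipeline becomes psql -q -t -f query.sql | only_ids | pingest.pl --pipe.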

If you are new here and don't know what record ingestion is about, it is Evergreen-speak for generating the indexes used for search, browse, facets, and record attributes. pingest.pl generates these indexes in parallel by splitting the records into batches and working on more than one batch at a time. Parallel processing is usually faster than starting with the first record and going straight through to the last.
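
The batching idea described above can be illustrated with a few lines of shell. This is only a sketch of the general technique using xargs, not how pingest.pl itself is implemented; the function names are hypothetical, and the echo stands in for the real indexing work. It also relies on the -P option, which GNU and BSD xargs provide but POSIX does not require.

```shell
# Group ids read from standard input into batches of at most $1 ids,
# printing one space-separated batch per line.
make_batches() {
    xargs -n "$1"
}

# Handle batches of at most $1 ids, running up to $2 batches at once.
# The echo is a stand-in for the real per-batch work, and the order
# in which the batches are reported is not deterministic.
run_parallel() {
    xargs -n "$1" -P "$2" sh -c 'echo "ingesting batch: $*"' sh
}
```

For example, printf '%s\n' 1 2 3 4 5 6 7 | make_batches 3 prints three lines holding the ids 1 2 3, then 4 5 6, then 7, and piping the same ids to run_parallel 3 2 handles those three batches with at most two running at a time.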

Tuesday, February 16, 2016

pingest.pl Gets an Outside Contribution

Bill Erickson contributed a patch to pingest.pl that adds new command line options and cleans up the handling of the current options.

The command line options are now:

--batch-size
         Number of records to process per batch.
--max-child
         Max number of worker processes.
--skip-browse
--skip-attrs
--skip-search
--skip-facets
         Skip the selected reingest component.
--start-id
         Start processing at this record ID.
--end-id
         Stop processing when this record ID is reached.
--max-duration
         Stop processing after this many total seconds have passed.
--help
         Show the help text and exit.
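
Because --max-duration is given in seconds, a small conversion helper can make a time-limited run easier to read. The sketch below only builds and prints a command line and does not run anything. The helper names and the option values are made up for illustration, and passing a value as a separate word after the option name is an assumption about how pingest.pl parses its arguments, so check --help first.

```shell
# Convert a whole number of hours to seconds for --max-duration.
hours_to_seconds() {
    echo $(( $1 * 3600 ))
}

# Print (but do not run) an example command line that skips browse
# indexing and stops after $1 hours. All values are illustrative.
example_command() {
    echo "pingest.pl --batch-size 500 --max-child 4 --skip-browse --max-duration $(hours_to_seconds "$1")"
}
```

Calling example_command 6 prints a command line with --max-duration 21600, which you can review before running it yourself.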

For those of you who may not know, pingest.pl is useful for reindexing records in your Evergreen database. It can reindex them in parallel, which reduces the time the job takes.

You can find pingest.pl as part of MVLC's Evergreen utilities git repository.

Tuesday, February 9, 2016

MVLC's Helper Functions Get an Update

If you have been using functions from MVLC's helperfuncs repo, you might be interested to know that they received an update today after nearly three-and-a-half years.

Today, I added checks for some of the optional parameters in the helper functions that create circ matrix and hold matrix matchpoints. Basically, in the cases where the functions have to look up an id in the database, they will raise an exception if that lookup returns NULL. This saves you from entering a matchpoint with a NULL value in one of these fields when you make a typo.

So, if you're using these functions, you might want to git pull and update them in your database.