This is an overview of some of Zebra's most important features:
Very large databases: files for indexes, etc., can be
automatically partitioned over multiple disks.
Arbitrarily complex records. The internal data format
is a structured format conceptually similar to XML or GRS-1,
which allows lists, nested structured data elements and
variant forms of data.
Robust updating: records can be added and deleted "on the fly"
without rebuilding the index from scratch.
Records can be safely updated even while users are accessing
the server.
The update procedure tolerates crashes and hard interrupts
during database updating; data can be reconstructed following
a crash.
Configurable to understand many input formats.
A system of input filters driven by
regular expressions allows most ASCII-based
data formats to be easily processed.
SGML, XML, ISO2709 (MARC), and raw text are also
supported.
Searching supports a powerful combination of boolean queries and
relevance-ranked (free-text) queries. Truncation,
masking, full regular expression matching and "approximate
matching" (e.g., tolerating spelling mistakes) are all handled.
Index-only databases: data can be, and usually is, imported
into Zebra's own storage, but Zebra can also refer to
external files, building and maintaining indexes of "live"
collections.
Protocol facilities: Init, Search, Present (retrieval),
Segmentation (support for very large records), Delete, Scan
(index browsing), Sort, Close and support for the "update"
Extended Service to add a new XML record or replace an existing one.
Piggy-backed presents are honored in the search request; that
is, a subset of the found records can be returned directly with
a search response, enabling search and retrieval to happen in a
single round-trip.
Named result sets are supported.
Easily configured to support different application profiles, with
tables for attribute sets, tag sets, and abstract syntaxes.
Additional tables control facilities such as element mappings to
different schemas (e.g., GILS-to-USMARC).
Complex composition specifications using Espec-1 (partial support).
Element sets are defined using the Espec-1 capability,
and are specified in configuration files as simple element
requests (and, optionally, variant requests).
Multiple record syntaxes for data retrieval: GRS-1, SUTRS,
XML, ISO2709 (MARC), etc. Records can be mapped between record
syntaxes and schemas on the fly.
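
The piggy-backed present mentioned above can be illustrated with a small sketch. It models how the Z39.50 Search request parameters smallSetUpperBound, largeSetLowerBound and mediumSetPresentNumber determine how many records accompany the search response. It describes the protocol rule only and is not taken from Zebra's source code; the function name piggyback_count is hypothetical and chosen for illustration.

```python
def piggyback_count(hits, small_set_upper_bound,
                    large_set_lower_bound, medium_set_present_number):
    """Number of records returned together with a search response.

    Illustrative model of the Z39.50 piggyback rule (hypothetical helper,
    not Zebra's actual implementation).
    """
    if hits <= small_set_upper_bound:
        # Small set: every record found is returned with the response.
        return hits
    if hits >= large_set_lower_bound:
        # Large set: no records are returned; the client must issue Present.
        return 0
    # Medium set: at most medium_set_present_number records are returned.
    return min(hits, medium_set_present_number)


if __name__ == "__main__":
    for hits in (3, 50, 500):
        print(hits, piggyback_count(hits, 5, 100, 10))
```

With these example bounds, a client that finds only a few hits receives all of them in a single round-trip, while a very large result set returns no records until the client issues an explicit Present request.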