Base class for precomputed factories that create MultiSearchProcessors.
MultiSearchProcessor searches for multiple needles in the haystack simultaneously, scanning every byte of the input sequentially and only once. It can also be used to search for a single needle, but a SearchProcessorFactory is more efficient for that case.
See the documentation of AbstractSearchProcessorFactory for a comprehensive description of common usage.
In addition to the functionality provided by SearchProcessor, MultiSearchProcessor adds a method that returns the index of the needle found at the processor's current position: MultiSearchProcessor.getFoundNeedleId().
Note: in some cases one needle can be a suffix of another needle, e.g. {"BC", "ABC"}, so multiple needles can be found ending at the same position of the haystack. In such a case MultiSearchProcessor.getFoundNeedleId() returns the index of the longest matching needle in the array of needles.
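The tie-breaking rule in the note above can be illustrated with a brute-force check in plain Java. This is only a sketch of the rule, not how Netty computes the result; the class LongestNeedleRuleSketch and the method longestNeedleEndingAt are hypothetical names, and the haystack is assumed to be a plain byte[] rather than a ByteBuf.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper illustrating the "longest needle wins" rule; not part of Netty. */
final class LongestNeedleRuleSketch {

    /** Returns the index of the longest needle whose last byte is haystack[end], or -1 if none ends there. */
    static int longestNeedleEndingAt(byte[] haystack, int end, byte[][] needles) {
        int best = -1;
        for (int id = 0; id < needles.length; id++) {
            byte[] needle = needles[id];
            int start = end - needle.length + 1;
            if (needle.length == 0 || start < 0) {
                continue; // empty needle, or needle too long to end at this position
            }
            boolean matches = true;
            for (int j = 0; j < needle.length; j++) {
                if (haystack[start + j] != needle[j]) {
                    matches = false;
                    break;
                }
            }
            // Keep the candidate only if it is longer than the best match so far.
            if (matches && (best == -1 || needle.length > needles[best].length)) {
                best = id;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        byte[][] needles = {
            "BC".getBytes(StandardCharsets.US_ASCII),
            "ABC".getBytes(StandardCharsets.US_ASCII)
        };
        byte[] haystack = "XABC".getBytes(StandardCharsets.US_ASCII);
        // Both "BC" and "ABC" end at index 3; the longer one ("ABC", id 1) is reported.
        System.out.println(longestNeedleEndingAt(haystack, 3, needles)); // prints 1
    }
}
```

With the needles {"BC", "ABC"} and the haystack "XBC", the same call at index 2 yields 0, because only "BC" ends there.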
Usage example (given that the haystack is a ByteBuf containing "ABCD" and the needles are "AB", "BC" and "CD"):
// The static factory method is declared on AbstractMultiSearchProcessorFactory.
MultiSearchProcessorFactory factory = AbstractMultiSearchProcessorFactory.newAhoCorasicSearchProcessorFactory(
    "AB".getBytes(CharsetUtil.UTF_8), "BC".getBytes(CharsetUtil.UTF_8), "CD".getBytes(CharsetUtil.UTF_8));
MultiSearchProcessor processor = factory.newSearchProcessor();
int idx1 = haystack.forEachByte(processor);
// idx1 is 1 (index of the last character of the occurrence of "AB" in the haystack)
// processor.getFoundNeedleId() is 0 (index of "AB" in needles[])
int continueFrom1 = idx1 + 1;
// continue the search starting from the next character
int idx2 = haystack.forEachByte(continueFrom1, haystack.readableBytes() - continueFrom1, processor);
// idx2 is 2 (index of the last character of the occurrence of "BC" in the haystack)
// processor.getFoundNeedleId() is 1 (index of "BC" in needles[])
int continueFrom2 = idx2 + 1;
int idx3 = haystack.forEachByte(continueFrom2, haystack.readableBytes() - continueFrom2, processor);
// idx3 is 3 (index of the last character of the occurrence of "CD" in the haystack)
// processor.getFoundNeedleId() is 2 (index of "CD" in needles[])
int continueFrom3 = idx3 + 1;
int idx4 = haystack.forEachByte(continueFrom3, haystack.readableBytes() - continueFrom3, processor);
// idx4 is -1 (no more occurrences of any of the needles)
// This search session is complete, processor should be discarded.
// To search for the same needles again, reuse the same AbstractMultiSearchProcessorFactory
// to get a new MultiSearchProcessor.
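
To make the single-pass behaviour behind the usage example more concrete, here is a minimal, self-contained Aho-Corasick sketch in plain Java. It is a teaching illustration under stated assumptions, not Netty's implementation: the class MultiNeedleScanSketch and its methods scan, foundNeedleId and ascii are hypothetical, the haystack is a byte[] instead of a ByteBuf, and the factory and processor roles are merged into one object for brevity.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Arrays;

/** Hypothetical Aho-Corasick sketch; it mirrors the usage example but is not Netty code. */
final class MultiNeedleScanSketch {
    private final int[][] next; // next[state][byte value] -> following state
    private final int[] found;  // id of the longest needle ending in a state, or -1
    private int stateCount = 1; // state 0 is the root
    private int state;          // current automaton state, kept between scan() calls
    private int foundId = -1;

    MultiNeedleScanSketch(byte[]... needles) {
        int maxStates = 1;
        for (byte[] needle : needles) {
            if (needle.length == 0) {
                throw new IllegalArgumentException("empty needle");
            }
            maxStates += needle.length;
        }
        next = new int[maxStates][256];
        found = new int[maxStates];
        for (int[] row : next) {
            Arrays.fill(row, -1);
        }
        Arrays.fill(found, -1);

        // Step 1: build a trie of all needles.
        for (int id = 0; id < needles.length; id++) {
            int s = 0;
            for (byte b : needles[id]) {
                int c = b & 0xFF;
                if (next[s][c] == -1) {
                    next[s][c] = stateCount++;
                }
                s = next[s][c];
            }
            found[s] = id;
        }

        // Step 2: breadth-first pass that fills in failure transitions.
        int[] fail = new int[stateCount];
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        for (int c = 0; c < 256; c++) {
            if (next[0][c] == -1) {
                next[0][c] = 0;
            } else {
                queue.add(next[0][c]); // depth-1 states fail back to the root
            }
        }
        while (!queue.isEmpty()) {
            int s = queue.poll();
            if (found[s] == -1) {
                found[s] = found[fail[s]]; // a shorter needle may end here as a suffix
            }
            for (int c = 0; c < 256; c++) {
                int child = next[s][c];
                if (child == -1) {
                    next[s][c] = next[fail[s]][c];
                } else {
                    fail[child] = next[fail[s]][c];
                    queue.add(child);
                }
            }
        }
    }

    /** Scans haystack from index 'from'; returns the index of the last byte of the next match, or -1. */
    int scan(byte[] haystack, int from) {
        for (int i = from; i < haystack.length; i++) {
            state = next[state][haystack[i] & 0xFF];
            if (found[state] != -1) {
                foundId = found[state];
                return i;
            }
        }
        return -1;
    }

    int foundNeedleId() {
        return foundId;
    }

    static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        MultiNeedleScanSketch scanner = new MultiNeedleScanSketch(ascii("AB"), ascii("BC"), ascii("CD"));
        byte[] haystack = ascii("ABCD");
        int from = 0;
        int idx;
        while ((idx = scanner.scan(haystack, from)) != -1) {
            System.out.println("match ends at " + idx + ", needle id " + scanner.foundNeedleId());
            from = idx + 1; // continue from the next byte, keeping the automaton state
        }
    }
}
```

Each byte of the haystack is consumed exactly once because the automaton state survives between calls to scan, which is why continuing from idx + 1 still finds the overlapping "BC" and "CD" matches. A state that is itself the end of a needle keeps its own id, so the longest needle ending at a position is the one reported.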