A Survey of Software-Based String Matching Algorithms for Forensic Analysis

Yi-Ching Liao


Employing a fast string matching algorithm is essential for minimizing the overhead of extracting structured files from a raw disk image. In this paper, we summarize the concept, implementation, and main features of ten software-based string matching algorithms, and evaluate their applicability for forensic analysis. We provide comparisons between the selected software-based string matching algorithms from the perspective of forensic analysis by conducting their performance evaluation for file carving. According to the experimental results, the Shift-Or algorithm (R. Baeza-Yates & Gonnet, 1992) and the Karp-Rabin algorithm (Karp & Rabin, 1987) have the minimized search time for identifying the locations of specified headers and footers in the target disk.


string matching algorithm, forensic analysis, file carving, Scalpel, data recovery

Full Text:



AbuHmed, T., Mohaisen, A., & Nyang, D. (2007, November). A survey on deep packet inspection for intrusion detection systems. Magazine of Korea Telecommunication Society, 24, 25–36. arXiv:0803.0037

Aho, A. V., & Corasick, M. J. (1975, June). Efficient string matching: an aid to bibliographic search. Commun. ACM, 18(6), 333–340. doi: 10.1145/360825360855

Aoe, J.-i. (1994). Computer algorithms: string pattern matching strategies (Vol. 55). Wiley. com.

Baeza-Yates, R., & Gonnet, G. H. (1992, October). A new approach to text searching. Commun. ACM, 35(10), 74–82. doi: 101145/135239.135243

Baeza-Yates, R. A. (1989, April). Algorithms for string searching. SIGIR Forum, 23(3-4), 34–58. doi: 10.1145/74697.74700

Berry, T., & Ravindran, S. (1999). A fast string matching algorithm and experimental results. In Stringology (pp. 16–28).

Boyer, R. S., & Moore, J. S. (1977, October). A fast string searching algorithm. Commun. ACM, 20(10), 762–772. doi: 10.1145/359842.359859

Charras, C., & Lecroq, T. (2004). Handbook of exact string matching algorithms. King’s College Publications.

Coit, C., Staniford, S., & McAlerney, J. (2001). Towards faster string matching for intrusion detection or exceeding the speed of snort. In DARPA information survivability conference amp; exposition II, 2001. DISCEX ’01. proceedings (Vol. 1, pp. 367– 373 vol.1). doi: 10.1109/DISCEX.2001.932231

Commentz-Walter, B. (1979). A string matching algorithm fast on the average.

Automata, Languages and Programming, 6th Colloquium, 71, 118–132.

Horspool, R. N. (1980). Practical fast searching in strings. Software: Practice and Experience, 10(6), 501– 506. doi: 10.1002/spe.4380100608

Karp, R. M., & Rabin, M. (1987). Efficient randomized pattern-matching algorithms. IBM Journal of Research and Development, 31(2), 249–260. doi: 10.1147/ rd.312.0249

Knuth, D. E., Morris, J., & Pratt, V. R. (1977). Fast pattern matching in strings. SIAM journal on computing, 6(2), 323–350. doi:10.1137/0206024

Navarro, G. (2001, March). A guided tour to approximate string matching. ACM Comput. Surv., 33(1), 31–88. doi: 10.1145/ 375360.375365

Nick Mikus. (2005a, March). Digital forensics tool testing image, available at http://dftt.sourceforge.net/test11/index.html.

Nick Mikus. (2005b, March). Digital forensics tool testing image, available at http://dftt.sourceforge.net/test12/index.html.

Pungila, C. (2012). Improved file-carving through data-parallel pattern matching for data forensics. In 2012 7th IEEE international symposium on applied computational intelligence and informatics (SACI) (pp. 197–202). doi: 10.1109/SACI.2012.6250001

Raita, T. (1992). Tuning the boyer-moore-horspool string searching algorithm. Software: Practice and Experience, 22(10), 879–884. doi: 10.1002/spe.4380221006

Richard III, G. G., & Roussev, V. (2005). Scalpel: A frugal, high performance file carver. Refereed Proceedings of the 5th Annual Digital Forensic Research Workshop. Retrieved 2014-12-12, from http://www.dfrws.org/2005/proceedings/richard_scalpel.pdf

Smith, P. D. (1991). Experiments with a very fast substring search algorithm. Software: Practice and Experience, 21(10), 1065–1074. doi: 10.1002/spe.4380211006

Sunday, D. M. (1990, August). A very fast substring search algorithm. Commun. ACM, 33(8), 132–142. doi: 10.1145/79173.79184

Tuck, N., Sherwood, T., Calder, B., & Varghese, G. (2004). Deterministic memory

efficient string matching algorithms for intrusion detection. In INFOCOM 2004. twenty-third Annual Joint conference of the IEEE computer and communications societies (Vol. 4, pp. 2628–2639 vol.4). doi: 10.1109/INFCOM.2004.1354682

Wu, S., & Manber, U. (1994). A fast algorithm for multi-pattern searching (Tech. Rep.). Technical Report TR-94-17, University of Arizona. Retrieved 2014-12-12, from http://webglimpse.net/pubs/TR94-17.pdf


  • There are currently no refbacks.

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

(c) 2006-2015 Association of Digital Forensics, Security and Law