A Survey of Software-Based String Matching Algorithms for Forensic Analysis

Yi-Ching Liao

Abstract


Employing a fast string matching algorithm is essential for minimizing the overhead of extracting structured files from a raw disk image. In this paper, we summarize the concept, implementation, and main features of ten software-based string matching algorithms, and evaluate their applicability for forensic analysis. We provide comparisons between the selected software-based string matching algorithms from the perspective of forensic analysis by conducting their performance evaluation for file carving. According to the experimental results, the Shift-Or algorithm (R. Baeza-Yates & Gonnet, 1992) and the Karp-Rabin algorithm (Karp & Rabin, 1987) have the minimized search time for identifying the locations of specified headers and footers in the target disk.

Keywords


string matching algorithm, forensic analysis, file carving, Scalpel, data recovery

Full Text:

PDF

References


AbuHmed, T., Mohaisen, A., & Nyang, D. (2007, November). A survey on deep packet inspection for intrusion detection systems. Magazine of Korea Telecommunication Society, 24, 25–36. arXiv:0803.0037

Aho, A. V., & Corasick, M. J. (1975, June). Efficient string matching: an aid to bibliographic search. Commun. ACM, 18(6), 333–340. doi: 10.1145/360825360855

Aoe, J.-i. (1994). Computer algorithms: string pattern matching strategies (Vol. 55). Wiley. com.

Baeza-Yates, R., & Gonnet, G. H. (1992, October). A new approach to text searching. Commun. ACM, 35(10), 74–82. doi: 101145/135239.135243

Baeza-Yates, R. A. (1989, April). Algorithms for string searching. SIGIR Forum, 23(3-4), 34–58. doi: 10.1145/74697.74700

Berry, T., & Ravindran, S. (1999). A fast string matching algorithm and experimental results. In Stringology (pp. 16–28).

Boyer, R. S., & Moore, J. S. (1977, October). A fast string searching algorithm. Commun. ACM, 20(10), 762–772. doi: 10.1145/359842.359859

Charras, C., & Lecroq, T. (2004). Handbook of exact string matching algorithms. King’s College Publications.

Coit, C., Staniford, S., & McAlerney, J. (2001). Towards faster string matching for intrusion detection or exceeding the speed of snort. In DARPA information survivability conference amp; exposition II, 2001. DISCEX ’01. proceedings (Vol. 1, pp. 367– 373 vol.1). doi: 10.1109/DISCEX.2001.932231

Commentz-Walter, B. (1979). A string matching algorithm fast on the average.

Automata, Languages and Programming, 6th Colloquium, 71, 118–132.

Horspool, R. N. (1980). Practical fast searching in strings. Software: Practice and Experience, 10(6), 501– 506. doi: 10.1002/spe.4380100608

Karp, R. M., & Rabin, M. (1987). Efficient randomized pattern-matching algorithms. IBM Journal of Research and Development, 31(2), 249–260. doi: 10.1147/ rd.312.0249

Knuth, D. E., Morris, J., & Pratt, V. R. (1977). Fast pattern matching in strings. SIAM journal on computing, 6(2), 323–350. doi:10.1137/0206024

Navarro, G. (2001, March). A guided tour to approximate string matching. ACM Comput. Surv., 33(1), 31–88. doi: 10.1145/ 375360.375365

Nick Mikus. (2005a, March). Digital forensics tool testing image, available at http://dftt.sourceforge.net/test11/index.html.

Nick Mikus. (2005b, March). Digital forensics tool testing image, available at http://dftt.sourceforge.net/test12/index.html.

Pungila, C. (2012). Improved file-carving through data-parallel pattern matching for data forensics. In 2012 7th IEEE international symposium on applied computational intelligence and informatics (SACI) (pp. 197–202). doi: 10.1109/SACI.2012.6250001

Raita, T. (1992). Tuning the boyer-moore-horspool string searching algorithm. Software: Practice and Experience, 22(10), 879–884. doi: 10.1002/spe.4380221006

Richard III, G. G., & Roussev, V. (2005). Scalpel: A frugal, high performance file carver. Refereed Proceedings of the 5th Annual Digital Forensic Research Workshop. Retrieved 2014-12-12, from http://www.dfrws.org/2005/proceedings/richard_scalpel.pdf

Smith, P. D. (1991). Experiments with a very fast substring search algorithm. Software: Practice and Experience, 21(10), 1065–1074. doi: 10.1002/spe.4380211006

Sunday, D. M. (1990, August). A very fast substring search algorithm. Commun. ACM, 33(8), 132–142. doi: 10.1145/79173.79184

Tuck, N., Sherwood, T., Calder, B., & Varghese, G. (2004). Deterministic memory

efficient string matching algorithms for intrusion detection. In INFOCOM 2004. twenty-third Annual Joint conference of the IEEE computer and communications societies (Vol. 4, pp. 2628–2639 vol.4). doi: 10.1109/INFCOM.2004.1354682

Wu, S., & Manber, U. (1994). A fast algorithm for multi-pattern searching (Tech. Rep.). Technical Report TR-94-17, University of Arizona. Retrieved 2014-12-12, from http://webglimpse.net/pubs/TR94-17.pdf


Refbacks

  • There are currently no refbacks.


Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.

(c) 2006-2015 Association of Digital Forensics, Security and Law