Mega Code Archive

 
 

Fast search for a string in a file

Question: Search for a string in a file, something similar to grep.

Answer: This tip is an example of a fast file search that reads through a buffer for speed. It finds the first occurrence of a string in a file (if there is one) and reports its position from the beginning of the file. It works like Pos(Substring, String), except that instead of searching a string in memory we search a file that may be several gigabytes long, with the advantage of not having to load it all into memory at once.

To achieve this, we load the file into a memory buffer (8 KB in the example), piece by piece. The process is as follows:

- We load an 8 KB piece and search inside it.
- If we find the string in that piece, we stop and report where it was found.
- If the string was not found in that piece, we repeat the process: we load a new piece and search again.

That is all well and good, but we have overlooked a small detail: what happens if the string straddles two of those 8 KB pieces? In that case our search would fail miserably :) To avoid this, we rewind the stream slightly just before reading the next piece. Specifically, we rewind by as many bytes as the length of the search string, which ensures that we find it even if it falls across two pieces.

This could easily be adapted to, for example, count how many times the string appears in the file, replace one string with another, build your own grep command, and so on.
Here is the function together with an example call, all contained in the OnClick handler of any TButton:

procedure TForm1.Button1Click(Sender: TObject);
var
  EncontradaEn: Int64;

  function BuscaStringEnFichero(const Fichero: string; const Cadena: AnsiString): Int64;
  { Looks for the first occurrence of the string 'Cadena' in the file
    'Fichero'. Returns its position counting from the beginning of the
    file (the first byte is position 1), or -1 if the string was not found.
    Int64 is used so that positions beyond 2 GB do not overflow.
    Radikal Q3 for Trucomania }
  const
    { We read 8K at a time }
    CUANTOBUFFER = 8192;
  var
    Corriente: TFileStream;
    Almacen: AnsiString;  { byte buffer: AnsiString, so one char is one byte }
    Leidos: Integer;      { number of bytes actually read into Almacen }
    Donde: Integer;
    Parar: Boolean;
    Posicion: Int64;
  begin
    Result := -1;
    { An empty string, or one that does not fit in the buffer,
      cannot be searched this way }
    if (Cadena = '') or (Length(Cadena) >= CUANTOBUFFER) then
      Exit;
    SetLength(Almacen, CUANTOBUFFER);
    Corriente := TFileStream.Create(Fichero, fmOpenRead or fmShareDenyWrite);
    try
      Corriente.Seek(0, soFromBeginning);
      repeat
        { Remember where this piece starts, before reading it }
        Posicion := Corriente.Position;
        Leidos := Corriente.Read(Almacen[1], CUANTOBUFFER);
        { Stop when there is nothing more to read
          (or, below, when the string has been found) }
        Parar := Leidos < CUANTOBUFFER;
        { Search only the bytes just read, not stale data left in the buffer }
        Donde := Pos(Cadena, Copy(Almacen, 1, Leidos));
        if Donde > 0 then
        begin
          Result := Donde + Posicion;
          { If we have found it, we also stop }
          Parar := True;
        end
        else if not Parar then
        begin
          { Rewind a little in case the string straddles two pieces
            of CUANTOBUFFER bytes; note the negative offset }
          Corriente.Seek(-Length(Cadena), soFromCurrent);
        end;
      until Parar;
    finally
      Corriente.Free;
    end;
  end;

begin
  { Usage example: run the search }
  EncontradaEn := BuscaStringEnFichero('c:\Ejemplo.txt', 'BuscaMe');
  { If the string was found, show where; otherwise say that it was not }
  if EncontradaEn <> -1 then
  begin
    ShowMessage('Cadena encontrada en: ' +  // string found in:
                IntToStr(EncontradaEn));
  end
  else
  begin
    ShowMessage('Lo siento, cadena no encontrada en el fichero' + #13 +
                'I''m sorry, string not found in the file');
  end;
end;

Radikal
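
The answer mentions that the same technique could be adapted to count how many times the string appears in the file. What follows is only a minimal sketch of that variation, not part of the original tip. The names CountStringInFile, Needle, BUFFER_SIZE and the program name are hypothetical. It assumes a plain byte-for-byte comparison, counts overlapping occurrences separately, and steps back Length(Needle) - 1 bytes rather than the full length, so that a match ending exactly at a piece boundary is not counted twice.

```pascal
program CountDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  Classes, SysUtils;

{ Hypothetical helper, not from the original tip: counts every occurrence
  of Needle in the file, overlapping occurrences included. }
function CountStringInFile(const FileName: string;
  const Needle: AnsiString): Integer;
const
  BUFFER_SIZE = 8192;  { same 8K piece size as the tip }
var
  Stream: TFileStream;
  Buffer: AnsiString;
  BytesRead, I, L: Integer;
begin
  Result := 0;
  L := Length(Needle);
  { An empty needle, or one that does not fit in the buffer, is rejected }
  if (L = 0) or (L >= BUFFER_SIZE) then
    Exit;
  SetLength(Buffer, BUFFER_SIZE);
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    repeat
      BytesRead := Stream.Read(Buffer[1], BUFFER_SIZE);
      { Compare the needle against every start position in the bytes
        just read; the loop body is skipped when fewer than L bytes came in }
      for I := 1 to BytesRead - L + 1 do
        if CompareMem(@Buffer[I], PAnsiChar(Needle), L) then
          Inc(Result);
      { After a full read, step back L - 1 bytes so that a match straddling
        two pieces is seen whole in the next piece, exactly once }
      if BytesRead = BUFFER_SIZE then
        Stream.Position := Stream.Position - (L - 1);
    until BytesRead < BUFFER_SIZE;
  finally
    Stream.Free;
  end;
end;

begin
  { Same example file and search string as in the tip above }
  WriteLn(CountStringInFile('c:\Ejemplo.txt', 'BuscaMe'));
end.
```

TFileStream.Create raises an exception if the file cannot be opened, so a real caller would probably wrap the call in try..except. The byte-by-byte comparison is kept for clarity; a faster search routine could be substituted without changing the piece-by-piece reading.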