C++ Removing Duplicates

Hello guys, I'm trying to remove duplicate strings in a text file and just need some guidance. First, would it be better to use nested while loops, such as:
while(!data_input.eof())
I have tried using nested for loops such as
[HTML]
for(int current_line=0;current_line<start_number_lines;current_line++)
{
    get line from text
    for(int check_line=(current_line+1);current_line<start_number_lines;check_line++)
    {
        get line from text
        check if both are same
    }
    if they are the same
    {
        data_output<<current_line;
    }
}
[/HTML]
But nothing has been working for me. I don't think I should nest the while loops. Also, when I count the number of lines using a function that runs through the text file, every later read from the file starts at EOF, so I'm getting blank lines. I know I have to use buffers, but I'm not clear on how to call each line back from the buffer. Wouldn't it be a lot of pointers?

Thanks in advance,
Mike

Comments

  • BLuKnight Lehi, UT Icrontian
    edited June 2007
    Your method is good. However, it could be time consuming. I'd recommend sorting the file first. If you're in a unix environment, you can send a command from c++ to unix telling it to sort the file. After that's done you don't need the while loops. Just run through the file once and see if the previous line matches the current line.

    Hope that helps. Great brain teaser. I had to think about it for 5 minutes.
  • edited June 2007
    BLuKnight wrote:
    Your method is good. However, it could be time consuming. I'd recommend sorting the file first. If you're in a unix environment, you can send a command from c++ to unix telling it to sort the file. After that's done you don't need the while loops. Just run through the file once and see if the previous line matches the current line.

    Hope that helps. Great brain teaser. I had to think about it for 5 minutes.

    Yeah, I know, that's what I'm asking. The while loops aren't for sorting. They are there to run through the file, comparing each line to the previous one. And in order to check the previous line, I believe you are going to have to load it into a buffer. Any other suggestions?
  • shwaip bluffin' with my muffin Icrontian
    edited June 2007
    if you're in unix, and you don't _have_ to use c++:

    sort infile | uniq > outfile

    i suppose if you want to use c++ (and the file is already sorted):

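    [php]
    #include <cstring>
    #include <fstream>
    using namespace std;

    const int BUFFER_LEN = 256;  // longer lines make getline() fail and end the loop

    int main()
    {
        ifstream in("infile");
        ofstream out("outfile");

        char lastLine[BUFFER_LEN];
        char line[BUFFER_LEN];

        // copy the first line through unconditionally
        if(in.getline(lastLine,BUFFER_LEN))
            out << lastLine << '\n';

        // the read is the loop condition, so a failed read is never processed
        while(in.getline(line,BUFFER_LEN)){
            if(strcmp(line,lastLine) != 0){  // differs from the previous line
                out << line << '\n';
                strcpy(lastLine,line);
            }
        }
        in.close();
        out.close();
        return 0;
    }
    [/php]
    i just wrote this here - there may be a few bugs or something.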
  • edited July 2007
    I thought you could only use strcmp if the two strings are constant strings?
  • shwaip bluffin' with my muffin Icrontian
    edited July 2007
    not afaik.
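
To illustrate the point about strcmp: its parameters are declared as const char*, which only promises that strcmp will not modify the strings. Writable char arrays convert to const char* implicitly, so they can be compared too. A small sketch, with a hypothetical helper that deliberately takes non-const pointers:

```cpp
#include <cstring>

// Hypothetical helper: the parameters are plain (non-const) char pointers
// on purpose, to show that strcmp accepts writable buffers as well.
bool same_line(char* line, char* lastLine) {
    return std::strcmp(line, lastLine) == 0;  // 0 means the strings are equal
}
```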