C++ Removing Duplicates


█Mike█
23 Jun 2007, 04:57am
Hello guys, I'm trying to remove duplicate strings from a text file and just need some guidance. First, would it be better to use nested while loops, e.g. while(!data_input.eof())?
I have tried using nested for loops such as:

for(int current_line=0; current_line<start_number_lines; current_line++)
{
    get line from text
    for(int check_line=current_line+1; check_line<start_number_lines; check_line++)
    {
        get line from text
        check if both are the same
    }
    if they are the same
    {
        data_output<<current_line;
    }
}

But nothing has been working for me. I don't think I should nest the while loops. Also, when I count the number of lines with a function that runs through the text file, every later read of the file starts at EOF, so I'm getting blank lines. I know I really have to use buffers, but I'm not clear on how to call each line back from the buffer. Wouldn't it be a lot of pointers?
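
On the blank lines after counting: once a read loop runs off the end of the file, the stream keeps its end-of-file and fail flags set, so every later read fails until the flags are cleared and the position is moved back to the start. A minimal sketch of that, assuming C++11; count_lines and first_line_after_count are hypothetical helper names, not anything from the original program:

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: counts the lines in a stream, then rewinds it
// so the caller can read the same data again from the beginning.
std::size_t count_lines(std::istream& in) {
    std::size_t n = 0;
    std::string line;
    while (std::getline(in, line)) {
        ++n;  // the loop ends when a read fails at end-of-file
    }
    in.clear();                  // reset the eof/fail flags first
    in.seekg(0, std::ios::beg);  // then move back to the start
    return n;
}

// Demo on an in-memory stream: returns the first line read after counting.
std::string first_line_after_count(const std::string& text) {
    std::istringstream in(text);
    count_lines(in);
    std::string line;
    std::getline(in, line);
    return line;
}
```

The same two calls, clear() followed by seekg(), apply to an ifstream. Without the clear(), later reads keep failing, which would match the blank lines described above.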

Thanks in advance,
Mike

BLuKnight
26 Jun 2007, 07:13am
Your method is good, but it could be time-consuming. I'd recommend sorting the file first. If you're in a Unix environment, you can send a command from C++ to Unix telling it to sort the file. Once that's done you don't need the while loops: just run through the file once and check whether the previous line matches the current line.

Hope that helps. Great brain teaser. I had to think about it for 5 minutes.
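
The sort-first idea can also be done inside C++ without calling out to Unix, provided the file fits in memory. A rough sketch under that assumption: once the lines are sorted, duplicates sit next to each other, so std::unique can drop them in one pass. The function names here are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads every line, sorts them, and removes adjacent duplicates.
// Note: the output is in sorted order, not the file's original order.
std::vector<std::string> sorted_unique_lines(std::istream& in) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        lines.push_back(line);
    }
    std::sort(lines.begin(), lines.end());
    // std::unique shifts the kept elements forward; erase trims the leftovers.
    lines.erase(std::unique(lines.begin(), lines.end()), lines.end());
    return lines;
}

// Convenience wrapper for trying the function on a string.
std::string sorted_unique_text(const std::string& text) {
    std::istringstream in(text);
    std::string out;
    for (const std::string& l : sorted_unique_lines(in)) {
        out += l;
        out += '\n';
    }
    return out;
}
```

With an ifstream opened on the input file, sorted_unique_lines can be called the same way and the result written out line by line.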

█Mike█
29 Jun 2007, 03:57pm
Yeah, I know, that's what I'm asking. The while loops aren't for sorting. They are there to run through the file, comparing the line being checked to the previous line. And in order to check the previous line, I believe you are going to have to load it into a buffer. Any other suggestions?
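
On the buffer question: with std::string and a standard container there are no manual pointers to manage. One possible alternative that needs no sorting, sketched here assuming C++11 and a file small enough to hold its distinct lines in memory, is to remember every line already seen in a std::unordered_set and write a line only the first time it appears. The names dedupe_keep_order and dedupe_text are hypothetical.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <unordered_set>

// Copies each distinct line from in to out, keeping the first occurrence
// and preserving the original order of the file.
void dedupe_keep_order(std::istream& in, std::ostream& out) {
    std::unordered_set<std::string> seen;
    std::string line;
    while (std::getline(in, line)) {
        // insert() reports in .second whether the line was newly added.
        if (seen.insert(line).second) {
            out << line << '\n';
        }
    }
}

// Convenience wrapper for trying the function on a string.
std::string dedupe_text(const std::string& text) {
    std::istringstream in(text);
    std::ostringstream out;
    dedupe_keep_order(in, out);
    return out.str();
}
```

Passing an ifstream and an ofstream to dedupe_keep_order works the same way. Unlike the sorted approach, the surviving lines stay in their original order.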

shwaip
29 Jun 2007, 04:35pm
if you're in unix, and you don't _have_ to use c++:

sort infile | uniq > outfile

i suppose if you want to use c++ (and the file is already sorted):


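#include <cstring>
#include <fstream>
using namespace std;

#define BUFFER_LEN 256

int main()
{
    ifstream in;
    ofstream out;

    in.open("infile",ios::in);
    out.open("outfile",ios::out);

    char lastLine[BUFFER_LEN];
    char line[BUFFER_LEN];

    // getline (not readLine) is the istream call; it fails at EOF
    // or if a line does not fit in BUFFER_LEN
    if(in.getline(lastLine,BUFFER_LEN)){
        out << lastLine << '\n';
        // test the read itself so a failed read is never processed
        while(in.getline(line,BUFFER_LEN)){
            if(strcmp(line,lastLine) != 0){
                out << line << '\n';
                strcpy(lastLine,line);
            }
        }
    }
    in.close();
    out.close();
    return 0;
}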
i just wrote this here - there may be a few bugs or something.

█Mike█
11 Jul 2007, 01:34pm
I thought you could only use strcmp if the two strings are constant strings?

shwaip
11 Jul 2007, 05:45pm
not afaik.
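
For what it's worth, the const char* parameters in strcmp's signature only mean the function will not modify what it is given; ordinary writable char arrays convert to const char* automatically. A small sketch to illustrate, where demo_mutable_buffers is a made-up name:

```cpp
#include <cassert>
#include <cstring>

// Compares two writable (non-const) buffers with strcmp, before and
// after one of them is modified in place.
bool demo_mutable_buffers() {
    char first[16];
    char second[16];
    std::strcpy(first, "apple");
    std::strcpy(second, "apple");

    bool equal_before = std::strcmp(first, second) == 0;  // same contents

    second[0] = 'A';  // the buffers are writable, so change one of them
    bool equal_after = std::strcmp(first, second) == 0;   // now they differ

    return equal_before && !equal_after;
}
```

The function returns true when strcmp reports the two arrays equal at first and different after the edit, which is the expected behaviour for plain char arrays.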