Batch-processing in MATLAB
geodave
Arizona
Hello All!
I'm fairly new to MATLAB and have been given a task that is quite challenging to me.
I have a text file that contains four columns and many rows (on the order of a few hundred thousand rows). I'm trying to write a script in MATLAB that will read the rows and columns from this file, and write only the first three columns into a new text file.
Here's what the original dataset in the text file looks like:
-16.517754 3.610515 -0.847929 30
-16.472557 3.611480 -0.845726 28
-16.477274 3.617941 -0.846026 30
-16.433626 3.616477 -0.843872 32
-16.431351 3.626801 -0.843872 30
-16.424358 3.630670 -0.843572 32
-16.406473 3.637529 -0.842770 32
-16.406305 3.642901 -0.842820 34
-16.403439 3.655784 -0.842820 34
-16.409687 3.659884 -0.843171 32
Basically, I want columns 1, 2, and 3, but not column 4, to be written in the new text file.
I have written the following script in MATLAB that works very well for a small dataset (a few thousand rows). It creates two text files; one with a header (x, y, z) and another without a header.
clear all
% CHANGE THIS TO YOUR FILE'S NAME
xyz_data = load('serrano_all_data.txt');
x = xyz_data(:,1);
y = xyz_data(:,2);
z = xyz_data(:,3);
B = [x y z]; % new 3-column matrix
% no header
dlmwrite('xyz_no_header.txt',B); % comma delimited
% with headers
fid = fopen('xyz_header.txt','w+t');
fprintf(fid,'x,y,z\n',B); % writes headers to text file
fclose(fid);
dlmwrite('xyz_ArcMap.txt',B,'-append'); % comma delimited, appends new xyz matrix to text file
However, when I try running this script on a much larger dataset (16 million rows), I get the following error in MATLAB:
??? Error using ==> horzcat
Out of memory. Type HELP MEMORY for your options.
Error in ==> three_column_utility at 22
B = [x y z]; % new 3-column matrix
I have a suspicion I'm getting this error because the dataset is very large. When I asked a friend who is more familiar with MATLAB than I, he suggested I process the data in batch rather than all at once. For example, the script should read the first thousand lines and write them into the new text file using the new format (3 columns instead of 4 columns), then move on to the next thousand lines and append those to the bottom of the first thousand lines in the new text file using the new format (3 columns instead of 4 columns), and so on, until the end of the file is reached (using the feof command, I think).
My problem is that I'm not quite sure how to do this (if this is the right approach, that is). Any help/suggestions/tips would be greatly appreciated!
Comments
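Rather than copying each column into its own variable (x, y and z) and then concatenating them into B, you can just address the columns of xyz_data directly, for example xyz_data(:,1:3).
So, your problem is that you're creating three copies of the data in memory. Addressing the columns directly should need only one copy.
I think this should work.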
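A minimal sketch of that single-copy idea, not necessarily the exact code the commenter had in mind. The file name sample_xyz_input.txt is a placeholder, and the three sample rows (taken from the original post) are only there so the script runs on its own; point inFile at the real data instead. The 'precision' option is an addition here, chosen to keep the six decimals shown in the sample data. Note that the indexing expression still builds a temporary 3-column array, so this reduces the number of copies rather than eliminating them.

```matlab
% Sketch only: file names are placeholders, and the sample rows below
% exist just to make the script self-contained.
inFile     = 'sample_xyz_input.txt';
outFile    = 'xyz_no_header.txt';
headerFile = 'xyz_header.txt';

% Write a tiny 4-column sample input (skip this part for real data).
fid = fopen(inFile, 'w');
fprintf(fid, '%s\n', '-16.517754 3.610515 -0.847929 30');
fprintf(fid, '%s\n', '-16.472557 3.611480 -0.845726 28');
fprintf(fid, '%s\n', '-16.477274 3.617941 -0.846026 30');
fclose(fid);

% Load once, then overwrite the variable with columns 1-3 so the
% 4-column array is no longer referenced and can be released.
xyz_data = load(inFile);
xyz_data = xyz_data(:, 1:3);

% No header: comma-delimited, six decimals.
dlmwrite(outFile, xyz_data, 'precision', '%.6f');

% With header: write the header line, then append the data to the SAME file.
fid = fopen(headerFile, 'w');
fprintf(fid, 'x,y,z\n');
fclose(fid);
dlmwrite(headerFile, xyz_data, '-append', 'precision', '%.6f');
```

This still needs the whole file to fit in memory for the load call, so it only helps when the extra copies were what pushed MATLAB over the limit.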
It's different than the error I used to get when I used my old script in that MATLAB wasn't happy using "horzcat", whereas now MATLAB isn't happy using "load".
What are your thoughts on this? Again, thanks for spending the time in helping me with this.
Basically, you're running out of memory. You're right that the file is way too long.
Try this (it'll probably be slow):
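One way to do this is the batch approach described in the original post: read a block of rows, write the first three columns, and repeat until feof reports the end of the file. The sketch below is an illustration under assumptions, not the commenter's exact code. It assumes every row has exactly four whitespace-separated numbers; the file names and the chunkRows value are placeholders, and the three sample rows are only there so the script runs on its own.

```matlab
% Sketch only: file names and chunkRows are placeholders.
inFile    = 'sample_xyz_input.txt';
outFile   = 'xyz_chunked.txt';
chunkRows = 100000;   % rows per batch; tune to the available memory

% Write a tiny 4-column sample input (skip this part for real data).
fid = fopen(inFile, 'w');
fprintf(fid, '%s\n', '-16.517754 3.610515 -0.847929 30');
fprintf(fid, '%s\n', '-16.472557 3.611480 -0.845726 28');
fprintf(fid, '%s\n', '-16.477274 3.617941 -0.846026 30');
fclose(fid);

fin  = fopen(inFile, 'r');
fout = fopen(outFile, 'w');

while ~feof(fin)
    % Read up to chunkRows rows; '%*f' parses column 4 but discards it.
    C = textscan(fin, '%f %f %f %*f', chunkRows);
    if isempty(C{1})
        break;   % nothing parsed (e.g. trailing blank lines): stop
    end
    % Assumes well-formed rows, so the three columns have equal length.
    B = [C{1} C{2} C{3}];
    % fprintf walks the matrix column by column, so transpose to emit
    % one x,y,z row per output line.
    fprintf(fout, '%.6f,%.6f,%.6f\n', B');
end

fclose(fin);
fclose(fout);
```

Only one batch is held in memory at a time, so the file size no longer matters; the cost is the repeated parsing and writing, which is where the slowness comes from.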
using a computer with more RAM... and it worked! It took ~39 minutes to process the ~0.6 GB text file, but it did it flawlessly!
I appreciate your help with this. Thank you!