Batch-processing in MATLAB
Hello All!
I'm fairly new to MATLAB and have been given a task that is quite challenging for me.
I have a text file that contains four columns and many rows (on the order of a few hundred thousand rows). I'm trying to write a script in MATLAB that will read the rows and columns from this file, and write only the first three columns into a new text file.
Here's what the original dataset in the text file looks like:
Code:
-16.517754 3.610515 -0.847929 30
-16.472557 3.611480 -0.845726 28
-16.477274 3.617941 -0.846026 30
-16.433626 3.616477 -0.843872 32
-16.431351 3.626801 -0.843872 30
-16.424358 3.630670 -0.843572 32
-16.406473 3.637529 -0.842770 32
-16.406305 3.642901 -0.842820 34
-16.403439 3.655784 -0.842820 34
-16.409687 3.659884 -0.843171 32
Basically, I want columns 1, 2, and 3, but not column 4, to be written to the new text file.
I have written the following script in MATLAB that works very well for a small dataset (a few thousand rows). It creates two text files: one with a header (x, y, z) and another without a header.
Code:
clear all
% CHANGE THIS TO YOUR FILE'S NAME
xyz_data = load('serrano_all_data.txt');
x = xyz_data(:,1);
y = xyz_data(:,2);
z = xyz_data(:,3);
B = [x y z]; % new 3-column matrix
% no header
dlmwrite('xyz_no_header.txt',B,'precision','%.6f'); % comma delimited; the default precision would round to 5 significant digits
% with headers
fid = fopen('xyz_header.txt','wt');
fprintf(fid,'x,y,z\n'); % writes header line to text file
fclose(fid);
dlmwrite('xyz_header.txt',B,'-append','precision','%.6f'); % comma delimited, appends xyz matrix below the header
However, when I try running this script on a much larger dataset (16 million rows), I get the following error in MATLAB:
Code:
??? Error using ==> horzcat
Out of memory. Type HELP MEMORY for your options.
Error in ==> three_column_utility at 22
B = [x y z]; % new 3-column matrix
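A rough back-of-the-envelope estimate of the memory involved, assuming load returns double-precision values (8 bytes each) and ignoring whatever temporary memory load itself needs while parsing the file. The variable names here are only for illustration:
Code:
```matlab
n_rows           = 16e6;  % rows in the large dataset
bytes_per_double = 8;     % load returns double-precision values

% size in MB (10^6 bytes) of an n_rows-by-n_cols double matrix
size_mb = @(n_cols) n_rows * n_cols * bytes_per_double / 1e6;

loaded_mb  = size_mb(4);  % xyz_data, all four columns: 512 MB
vectors_mb = size_mb(3);  % x, y and z together:        384 MB
matrix_mb  = size_mb(3);  % B = [x y z]:                384 MB

total_mb = loaded_mb + vectors_mb + matrix_mb;  % 1280 MB
fprintf('Approximate memory needed at B = [x y z]: %d MB\n', total_mb);
```
So by the time B is built, the script is holding well over 1 GB, and MATLAB needs each of those arrays to fit in a single contiguous block of memory.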
I suspect I'm getting this error because the dataset is so large. A friend who is more familiar with MATLAB than I am suggested processing the data in batches rather than all at once. For example, the script would read the first thousand lines and write them to the new text file in the new format (three columns instead of four), then read the next thousand lines and append them below the first thousand, and so on until the end of the file is reached (using the feof function, I think).
My problem is that I'm not quite sure how to do this (if this is the right approach, that is). Any help/suggestions/tips would be greatly appreciated!
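For reference, a minimal sketch of the batch approach described above. It is one possible way to do it, not necessarily the best one. The function name xyz_batch_copy, its arguments, and the chunk size are all made up for illustration; textscan, feof, and fprintf are standard MATLAB functions. It assumes every row has exactly four whitespace-separated numeric columns and that six decimal places is the precision wanted in the output.
Code:
```matlab
function xyz_batch_copy(in_file, out_file, chunk_size)
% XYZ_BATCH_COPY  Copy the first three columns of a four-column text file
% to a comma-delimited file, reading chunk_size rows at a time.
% (Hypothetical helper; name and arguments are assumptions.)

fin = fopen(in_file, 'r');
if fin == -1
    error('Could not open %s for reading', in_file);
end
fout = fopen(out_file, 'w');
if fout == -1
    fclose(fin);
    error('Could not open %s for writing', out_file);
end

fprintf(fout, 'x,y,z\n');  % header line; remove for the no-header file

while ~feof(fin)
    % Read up to chunk_size rows; %*f parses the fourth column and discards it
    C = textscan(fin, '%f %f %f %*f', chunk_size);
    xyz = [C{1} C{2} C{3}];   % at most chunk_size rows, three columns
    if isempty(xyz)
        break;                % nothing more to read (e.g. trailing blank lines)
    end
    % fprintf walks through its data column by column, so pass the transpose
    fprintf(fout, '%.6f,%.6f,%.6f\n', xyz');
end

fclose(fin);
fclose(fout);
end
```
It could then be called as xyz_batch_copy('serrano_all_data.txt', 'xyz_header.txt', 100000). Only one chunk is held in memory at a time, so memory use depends on chunk_size rather than on the total number of rows, and because the output file stays open there is no need to reopen it in append mode for each batch.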