C++ program to search an HTML file for words
leishi85
Grand Rapids, MI Icrontian
I'm starting to work on a program that searches an HTML file for words.
The user is going to input the name of the HTML file to search, then how many words to look for, and then the words themselves.
When searching through the HTML file, the program should only look through the body part of the file for the search words, and then make the matching words bold and color their background.
The search is going to be case insensitive.
The problem is that I'm not quite sure what approach to take. I know I have to use strings and vectors and manipulate the input data so that case won't be an issue, but I'm not too sure how to do this.
Comments
Not sure how much the app costs now. I used it over a year ago at my old job, so I didn't have to worry about price, and I am too lazy to look right now.
Might get you going in the right direction....
//edit: for a case-insensitive search, read in the string and make a temporary copy, converting each character with toupper or tolower (they work on single characters, not on a whole string). That gives you a case-insensitive string to compare against, but you will still have the original when you are finally ready to write out to your file.
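A minimal sketch of that temporary-copy idea, assuming a C++11 compiler. The helper names `to_upper_copy` and `find_ignore_case` are made up for illustration and are not part of any library:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Returns an upper-cased copy; the caller's original string is left untouched.
std::string to_upper_copy(std::string text) {
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char ch) { return static_cast<char>(std::toupper(ch)); });
    return text;
}

// Case-insensitive search: position of `word` in `text`, or std::string::npos.
// Upper-casing never changes the length, so the position is valid in the original too.
std::size_t find_ignore_case(const std::string& text, const std::string& word) {
    return to_upper_copy(text).find(to_upper_copy(word));
}
```

Because the position refers to the original string as well, you can search in the copy and still write out the text with its original capitalization.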
Sorry.
My recommendation is to break up the problem. If you've already learned the basics of object-oriented design and classes, you're ahead of the game.
Break up your code into modules. In this case you'll probably want to create a parser. This module will be responsible for bringing in the input from the web page. Next you'll need a module that determines if the incoming word is to be acted upon. That is, if you haven't hit the <body> tag yet, it shouldn't be looking at the word. When this module finds a word to act on, it should do something special with it. (I'm not giving away all of the answers.)
Then you should have a module to handle the output. Depending on how this module is called, it will output the word either normally or highlighted.
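One way the three pieces could fit together, as a rough sketch rather than a finished answer. It assumes C++11 and that the whole file has already been read into one string. The function names and the yellow `<b style=...>` markup are placeholders chosen for illustration, and it matches substrings, so "cat" would also be highlighted inside "category":

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Upper-cased copy, used only for case-insensitive matching.
std::string upper_copy(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char ch) { return static_cast<char>(std::toupper(ch)); });
    return s;
}

// Highlights keywords found between <body ...> and </body>, skipping text inside tags.
std::string highlight_body(const std::string& html, const std::vector<std::string>& keywords) {
    const std::string shadow = upper_copy(html);  // same length as html
    std::vector<std::string> keys;
    for (const std::string& k : keywords) keys.push_back(upper_copy(k));

    const std::size_t body_open = shadow.find("<BODY");
    if (body_open == std::string::npos) return html;  // no body: nothing to do
    std::size_t start = shadow.find('>', body_open);
    if (start == std::string::npos) return html;
    ++start;
    std::size_t end = shadow.find("</BODY", start);
    if (end == std::string::npos) end = html.size();

    std::string out = html.substr(0, start);  // head section is copied unchanged
    bool in_tag = false;
    std::size_t i = start;
    while (i < end) {
        if (html[i] == '<') in_tag = true;
        if (!in_tag) {
            bool matched = false;
            for (const std::string& k : keys) {
                if (!k.empty() && i + k.size() <= end && shadow.compare(i, k.size(), k) == 0) {
                    // Output module: original spelling, wrapped in highlight markup.
                    out += "<b style=\"background-color: yellow\">" + html.substr(i, k.size()) + "</b>";
                    i += k.size();
                    matched = true;
                    break;
                }
            }
            if (matched) continue;
        }
        if (html[i] == '>') in_tag = false;
        out += html[i];
        ++i;
    }
    out += html.substr(end);  // </body> and everything after it, unchanged
    return out;
}
```

Splitting this into separate parser, matcher, and output classes is left open here, since that is the part the exercise is meant to teach.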
Hope that helps.
It's not compiling, and I'm not too sure if it's going to work as it should.
Any ideas, guys?
[php]
#include <iostream>
#include <fstream>
#include <cctype>
#include <string>
#include <vector>
using namespace std;
int main()
{
cout << "File name: ";
string name;
cin >> name;
ifstream in(name.c_str());
string fileData;
in >> fileData;
while (in.fail())
{
cout << "Please enter a new file name: " << endl;
cin >> name;
ifstream in(name.c_str());
in >> fileData;
}
//=============================================================================
//brings all string into CAPITALS (only function used in program)
for (int i=0; i < name.size(); i= i + 1)
{
name=toupper(name);
}
//==============================================================================
//pick i number of words to search for ... prompts for each
vector<string>keywords;
int keynum;
string keyword;
cout << "Type in the number of key words you want to look for: " << endl;
cin >> keynum;
for (int count = 0; count <= keynum; count++)
{
cout << "Please enter the keyword to search for: " << endl;
cin >> keynum;
for (int count = 0; i < keyword.size(); i = i+1)
{
keyword = toupper(keyword);
}
keywords.push_back(keyword);
}
//==============================================================================
//bring the html file in line by line and vreak apart by whitespace
int count1=0 ;
int c=0;
string line;
getline(name, line);
bool condition=false;
vector<string> words;
while (condition==false)
}
while (count1 < line.size())
{
char spacesearch = line[count1];
if (isspace(spacesearch))
{
string word = line.substring(count1-c, count 1-1);
c = 1;
// write word into a vector
words.push_back(word);
}
//identifies the point at which </body exists and ends the entire loop
if (spacesearch=='<' and line[count1+1]=='/' and line[count1=2]=='B')
{
condition=true;
}
count1++;
c++;
}
getline(name,line);
}
//==============================================================================
//bring in one key word from the vector keyword(s) and compare it to each word
//found in the word(s) vector
//then bring in the next key word and repeat
int wordsize=words.size();
int h=0;
int j=0;
while (h<=keynum)
{
string searchword=keywords[h];
while (j<=wordsize)
{
string comparison=words[j];
if (searchword==comparison)
{
//colorizination
}
j++;
}
h++;
}
return(0);
}
//==============================================================================
//==============================================================================
[/php]
I also put in comments so you can see where I modified or corrected code. This should make it easier to understand where you made a few mistakes in syntax.
I hope this helps. C++ is a lot of fun and after you get C down, PHP is a snap. What compiler are you using for your C++ programming?
[PHP]
#include <iostream>
#include <fstream>
#include <cctype>
#include <string>
#include <vector>
using namespace std;
int main()
{
cout << "File name: ";
string name;
cin >> name;
ifstream in(name.c_str());
string fileData;
in >> fileData;
while (in.fail())
{
cout << "Please enter a new file name: " << endl;
cin >> name;
ifstream in(name.c_str());
in >> fileData;
}
//================================================== ===========================
//brings all string into CAPITALS (only function used in program)
// Changed i=i+1 to i++
for (int i=0; i < name.size(); i++)
{
// FIX: toupper() converts one character at a time
name[i]=toupper(name[i]);
}
//================================================== ============================
//pick i number of words to search for ... prompts for each
vector<string>keywords;
int keynum;
string keyword;
cout << "Type in the number of key words you want to look for: " << endl;
cin >> keynum;
for (int count = 0; count <= keynum; count++)
{
cout << "Please enter the keyword to search for: " << endl;
cin >> keynum;
// ERROR HERE
// replaced int count = 0 with int i = 0
for (int i = 0; i < keyword.size(); i = i+1)
{
// FIX: toupper() converts one character at a time
keyword[i] = toupper(keyword[i]);
}
keywords.push_back(keyword);
}
//================================================== ============================
//bring the html file in line by line and vreak apart by whitespace
int count1=0;
int c=0;
string line;
getline(name, line);
bool condition=false;
vector<string> words;
/* Old code
while (condition==false)
}
*/
// New Code
while (condition==false) {
while (count1 < line.size()) {
char spacesearch = line[count1];
if (isspace(spacesearch)) {
// ERROR
//string word = line.substring(count1-c, count 1-1);
// FIX
string word = line.substr(count1-c, count1-1);
c = 1;
// write word into a vector
words.push_back(word);
}
//identifies the point at which </body exists and ends the entire loop
// ERROR if (spacesearch=='<' and line[count1+1]=='/' and line[count1=2]=='B') {
// FIX: count1+2, not the assignment count1=2
if (spacesearch=='<' && line[count1+1]=='/' && line[count1+2]=='B') {
condition=true;
}
count1++;
c++;
}
getline(name,line);
}
//================================================== ============================
//bring in one key word from the vector keyword(s) and compare it to each word
//found in the word(s) vector
//then bring in the next key word and repeat
int wordsize=words.size();
int h=0;
int j=0;
while (h<=keynum)
{
string searchword=keywords[h];
while (j<=wordsize)
{
string comparison=words[j];
if (searchword==comparison)
{
//colorizination
}
j++;
}
h++;
}
return(0);
}
//================================================== ============================
//================================================== ============================
[/PHP]
[PHP]
#include <iostream>
#include <fstream>
#include <cctype>
#include <string>
#include <vector>
using namespace std;
int main() {
cout << "File name: ";
string name;
cin >> name;
/*
ifstream in(name.c_str());
string fileData;
in >> fileData;
while ( in.fail() ) {
cout << "Please enter a new file name: " << endl;
cin >> name;
ifstream in(name.c_str());
in >> fileData;
}
*/
fstream fin;
fin.open(name.c_str(), fstream::in);
while ( !fin.is_open() ) {
cout << "Please enter a new file name: " << endl;
cin >> name;
fin.open(name.c_str(), fstream::in);
}
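//=============================================================================
//brings all string into CAPITALS (only function used in program)
// Changed i=i+1 to i++
// FIX: toupper() converts one character at a time, not a whole string
// NOTE: this upper-cases the file name, not the file contents
for (size_t i=0; i < name.size(); i++) {
name[i]=toupper(static_cast<unsigned char>(name[i]));
}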
//================================================== ============================
//pick i number of words to search for ... prompts for each
vector<string> keywords;
int keynum;
string keyword;
cout << "Type in the number of key words you want to look for: " << endl;
cin >> keynum;
// FIX: count < keynum, otherwise one keyword too many is requested
for (int count = 0; count < keynum; count++)
{
cout << "Please enter the keyword to search for: " << endl;
// ERROR
// OLD cin >> keynum;
cin >> keyword;
// ERROR HERE
// replaced int count = 0 with int i = 0
// FIX: upper-case one character at a time
for (size_t i = 0; i < keyword.size(); i = i+1) {
keyword[i] = toupper(static_cast<unsigned char>(keyword[i]));
}
keywords.push_back(keyword);
}
//================================================== ============================
//bring the html file in line by line and vreak apart by whitespace
/*
* NOTE:
* This part of your code needs work. I'm not sure, but I think you should be
* ignoring HTML tags and you also need to check if you've hit the <BODY> tag yet.
* You also need to check to see if you've hit the </BODY> tag.
*/
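int count1=0;
int c=0; // length of the word currently being collected
string line;
bool condition=false;
vector<string> words;
/* Old code
while (condition==false)
}
*/
// New Code
// ERROR: Fixed getline
//getline(name, line);
// FIX: read inside the loop test so the loop also stops at end of file
while (condition==false && getline(fin, line)) {
// FIX: restart both counters on every new line
count1=0;
c=0;
// upper-case the line so it can match the upper-cased keywords
for (size_t i=0; i < line.size(); i++) {
line[i]=toupper(static_cast<unsigned char>(line[i]));
}
while (count1 < (int)line.size()) {
char spacesearch = line[count1];
//identifies the point at which </body exists and ends the entire loop
// ERROR if (spacesearch=='<' and line[count1+1]=='/' and line[count1=2]=='B') {
// FIX: count1+2 instead of the assignment count1=2, plus a bounds check
if (spacesearch=='<' && count1+2 < (int)line.size() && line[count1+1]=='/' && line[count1+2]=='B') {
condition=true;
break;
}
if (isspace(static_cast<unsigned char>(spacesearch))) {
// ERROR
//string word = line.substring(count1-c, count 1-1);
// FIX: substr takes (start, length); c is the length of the current word
if (c > 0) {
// write word into a vector
words.push_back(line.substr(count1-c, c));
}
c = 0;
}
else {
c++;
}
count1++;
}
// the last word before the end of the line (or before </BODY) has no whitespace after it
if (c > 0) {
words.push_back(line.substr(count1-c, c));
}
}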
//================================================== ============================
//bring in one key word from the vector keyword(s) and compare it to each word
//found in the word(s) vector
//then bring in the next key word and repeat
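int wordsize=words.size();
int h=0;
// FIX: < instead of <=, and bounded by keywords.size() to stay inside the vector
while (h < (int)keywords.size()) {
string searchword=keywords[h];
// FIX: restart j for every keyword, otherwise only the first keyword is compared
int j=0;
while (j<wordsize) {
string comparison=words[j];
if (searchword==comparison) {
//colorizination
}
j++;
}
h++;
}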
return(0);
}
//================================================== ============================
//================================================== ============================
[/PHP]
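For the "break apart by whitespace" step in the listing above, a shorter alternative to counting characters by hand is to let a string stream do the splitting. This is only a sketch, and `split_words` is a made-up helper name. It does not strip HTML tags or check for the body section, so that logic would still have to be added:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits one line into whitespace-separated words.
// operator>> skips leading whitespace and treats runs of spaces as one separator.
std::vector<std::string> split_words(const std::string& line) {
    std::vector<std::string> words;
    std::istringstream stream(line);
    std::string word;
    while (stream >> word) {
        words.push_back(word);
    }
    return words;
}
```

Calling it once per `getline(fin, line)` and appending the result to `words` would replace the inner character loop.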
Using the simple code where a=1, b=2, c=3, etc., a word can be assigned a (non-unique) score. For example, computer=111 (3+15+13+16+21+20+5+18). I want to write a program that can count the number of words in a text file that have a score specified by the user of the program. I want to use the program to find the number of words in a file that have a score of exactly 100.
Any help would be much appreciated
Mike