pwfengine
PWF Engine

PWF Engine is the engine of the PWF framework, used to create and use web wrappers based on generic schemas.

Let's suppose you want to navigate a page of a website to extract some information from it.

Note:
A general understanding of Qt is needed to use the engine.

We will create a class MyNavigator with some members:

PEngine *engine;
PSiteWrapper *siteWrapper;
PPageWrapper *pageWrapper;

and some slots:

public slots:
void attemptSiteSchemaFinished(PAction::StatusType finishedStatus);
void attemptPageSchemaFinished(PAction::StatusType finishedStatus);

We will see how to use them later.

The first thing you have to do is create the engine object and specify the schemas root directory:

engine = new PEngine();
engine->setSchemasCandidatesDirectory("schemas/");

Now you can create the site wrapper, passing the engine to it:

siteWrapper = new PSiteWrapper(engine);
siteWrapper->setUrl("http://www.sitename.com/");

Now the critical part: setting a schema. You have two choices:

Obviously, these are only some examples of schema usage; ideally, you could model a generic schema that applies to all forums!

PAction is a simple class that executes an action (in this case, setting or detecting the schema of a wrapper) and emits a Qt signal when it has finished. Check http://developer.qt.nokia.com/doc/signalsandslots.html for more information about the signal and slot mechanism.
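
The notify-on-finish pattern can be illustrated without Qt or PWF. The sketch below is a plain C++ stand-in, not the engine's API: FakeAction, StatusType, onFinished(), finish() and runSingleAction() are hypothetical names chosen for illustration. It only shows an object that reports a status to registered callbacks when its work completes, which is the role a signal and its connected slots play.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an action status; not the real PAction::StatusType.
enum class StatusType { Success, Failure };

// Hypothetical stand-in for an action that reports completion to listeners.
class FakeAction {
public:
    using Listener = std::function<void(StatusType)>;

    // Register a callback; this plays the role of connecting a slot to finished().
    void onFinished(Listener listener) { listeners_.push_back(std::move(listener)); }

    // Complete the action and notify every listener, like emitting a signal.
    void finish(StatusType status) {
        for (const auto &listener : listeners_) listener(status);
    }

private:
    std::vector<Listener> listeners_;
};

// Runs one action to completion and returns what the listener observed.
std::string runSingleAction(StatusType outcome) {
    std::string observed = "not called";
    FakeAction action;
    action.onFinished([&observed](StatusType status) {
        observed = (status == StatusType::Success) ? "success" : "failure";
    });
    action.finish(outcome);
    return observed;
}
```

In real Qt code the callback list is replaced by connect(), and the notification is delivered through the signal and slot machinery rather than a direct loop.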

Now we tell the system to call our attemptSiteSchemaFinished() slot when the schema action has finished its job, passing the status of the action to it.

connect(attemptSchema, SIGNAL(finished(PAction::StatusType)),
        this, SLOT(attemptSiteSchemaFinished(PAction::StatusType)));

If the action has finished without errors, we can create the wrapper for the page we want to navigate and repeat the operation to set/detect its schema.

void MyNavigator::attemptSiteSchemaFinished(PAction::StatusType finishedStatus) {
  if (finishedStatus == PAction::StatusSuccess) {
    // our schema is ready!
    pageWrapper = new PPageWrapper(engine, siteWrapper);
    PAction *attemptPageSchema = pageWrapper->setSchema("lastposts");
    connect(attemptPageSchema, SIGNAL(finished(PAction::StatusType)),
            this, SLOT(attemptPageSchemaFinished(PAction::StatusType)));
  } else {
    // error
  }
}
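
The control flow of the two chained steps can be sketched in plain C++, independent of Qt and PWF. Everything here is an assumption made for illustration: Status, Callback, attemptSchema() and navigate() are hypothetical names, and the stand-in step calls its callback immediately, whereas a real action would finish later from the event loop. The point is only that the page step starts from inside the site step's completion handler, and never starts if the site step fails.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical status type; the real engine uses PAction::StatusType.
enum class Status { Success, Failure };

using Callback = std::function<void(Status)>;

// Hypothetical stand-in for a schema step: reports `outcome` to `done`.
// It completes synchronously to keep the sketch self-contained.
void attemptSchema(Status outcome, const Callback &done) { done(outcome); }

// Chains the site step and the page step, logging each stage reached.
std::vector<std::string> navigate(Status siteOutcome, Status pageOutcome) {
    std::vector<std::string> log;
    attemptSchema(siteOutcome, [&](Status siteStatus) {
        if (siteStatus != Status::Success) {
            log.push_back("site schema failed");
            return;  // the page step is never started
        }
        log.push_back("site schema ready");
        attemptSchema(pageOutcome, [&](Status pageStatus) {
            log.push_back(pageStatus == Status::Success ? "page schema ready"
                                                        : "page schema failed");
        });
    });
    return log;
}
```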

When the page schema is ready, we can navigate it to extract the information we want:

void MyNavigator::attemptPageSchemaFinished(PAction::StatusType finishedStatus) {
  if (finishedStatus == PAction::StatusSuccess) {
    // our schema is ready!
    PPageElement *lastsPosts = pageWrapper->mappedPage();
    PPageElement post;
    while ((post = lastsPosts->next("post"))) {
        cout << "Title: " << post.child("title").at(0).text() << endl;
        cout << "Description: " << post.child("description").at(0).text() << endl;
    }
  } else {
    // error
  }
}
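
The extraction step amounts to walking a tree of named elements. The sketch below models that idea in plain C++ under stated assumptions: Element, its child() method and summarizePosts() are hypothetical stand-ins and do not reproduce the real PPageElement interface. It collects the first title and description of every post and skips posts that lack either one.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a mapped page element; not the real PPageElement.
struct Element {
    std::string name;
    std::string text;
    std::vector<Element> children;

    // Returns every direct child with the given name, in document order.
    std::vector<Element> child(const std::string &wanted) const {
        std::vector<Element> matches;
        for (const auto &c : children)
            if (c.name == wanted) matches.push_back(c);
        return matches;
    }
};

// Collects "title: description" for each post directly under the page root.
std::vector<std::string> summarizePosts(const Element &page) {
    std::vector<std::string> lines;
    for (const auto &post : page.child("post")) {
        const auto titles = post.child("title");
        const auto descriptions = post.child("description");
        if (titles.empty() || descriptions.empty()) continue;  // skip incomplete posts
        lines.push_back(titles.at(0).text + ": " + descriptions.at(0).text);
    }
    return lines;
}
```

Checking for empty results before calling at(0) avoids an out-of-range exception when a post is missing a field.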