[Question] Where should I set the content obtained from an HTTP request? #519

Answered by ato
naveen17797 asked this question in Q&A

Assuming your content is supplied by an InputStream called stream, something like this will probably work:

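// Use the recorder already supplied by the ToeThread
Recorder recorder = curi.getRecorder();
recorder.markContentBegin();
// Wrap the stream so the bytes read through it are recorded
recorder.inputWrap(stream);
// Read the wrapped stream to the end so the whole body is captured
recorder.getRecordedInput().readFully();
recorder.closeRecorders();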

handleCapturedRequest() in ExtractorChrome may be a relevant example of integrating Heritrix with a headless browser. Keep in mind, though, that it records subrequests on a background thread and so has to jump through a lot more hoops. Since you're writing a Fetch processor, you don't have to set up your own recorder and can use the one already supplied by the ToeThread, and similarly don't need to call…
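
To illustrate why the snippet above wraps the stream and then reads it fully, here is a minimal self-contained sketch of the general "tee" recording pattern. It assumes that a recorder only captures the bytes that actually pass through the wrapped stream, so the stream has to be drained to the end. TeeInputStream, recordedBytes() and record() are hypothetical names made up for this example. They are a stand-in and not Heritrix's Recorder or its implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class TeeRecorderSketch {

    // Hypothetical stand-in (NOT Heritrix's Recorder): copies every byte
    // that is read through it into an in-memory buffer.
    static class TeeInputStream extends FilterInputStream {
        private final ByteArrayOutputStream recorded = new ByteArrayOutputStream();

        TeeInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b != -1) {
                recorded.write(b);
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) {
                recorded.write(buf, off, n);
            }
            return n;
        }

        // Drain the underlying stream to EOF so the whole body is recorded.
        void readFully() throws IOException {
            byte[] buf = new byte[4096];
            while (read(buf, 0, buf.length) != -1) {
                // nothing to do: reading is what triggers the recording
            }
        }

        byte[] recordedBytes() {
            return recorded.toByteArray();
        }
    }

    // Wrap, drain, close: the same order of steps as in the answer above.
    static byte[] record(InputStream stream) throws IOException {
        try (TeeInputStream tee = new TeeInputStream(stream)) {
            tee.readFully();
            return tee.recordedBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] body = "<html>hello</html>".getBytes(StandardCharsets.UTF_8);
        byte[] recorded = record(new ByteArrayInputStream(body));
        System.out.println(new String(recorded, StandardCharsets.UTF_8));
    }
}
```

In this sketch, skipping readFully() would leave the buffer empty, because nothing has been read through the wrapper yet. Presumably that is the reason for the explicit readFully() call in the snippet above.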

Answer selected by ato

This discussion was converted from issue #438 on September 30, 2022 00:45.