
Added experimental background job queue #20985

Open · wants to merge 23 commits into main

Conversation
9larsons
Contributor

ref ???

  • added background job queue behind config flags
  • when enabled, it is only used for the member email analytics updates, in order to speed up the parent job and take load off of the main process that serves requests

Added config flags for the job queue to make it switchable.

Config options live under `services:jobs:queue`: `enabled`, `reportStats`, `reportInterval`, `maxWorkers`, `logLevel` (`info`/`debug`), `pollMinInterval`, `pollMaxInterval`, `queueCapacity`, and `fetchCount`.
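As a rough sketch of how these flags might be consumed (the option names come from this PR, but the defaults and the accessor function are assumptions, not Ghost's actual config API):

```javascript
// Sketch only: merge assumed defaults with whatever is set under
// services:jobs:queue. Option names are from the PR; defaults are guesses.
const QUEUE_DEFAULTS = {
    enabled: false,
    reportStats: false,
    reportInterval: 60000,
    maxWorkers: 1,
    logLevel: 'info',
    pollMinInterval: 100,
    pollMaxInterval: 5000,
    queueCapacity: 500,
    fetchCount: 100
};

function getQueueConfig(config = {}) {
    const queue = (((config.services || {}).jobs || {}).queue) || {};
    return Object.assign({}, QUEUE_DEFAULTS, queue);
}

// The queue stays disabled unless explicitly turned on.
console.log(getQueueConfig({}).enabled); // false
console.log(getQueueConfig({services: {jobs: {queue: {enabled: true}}}}).enabled); // true
```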
Contributor

It looks like this PR contains a migration 👀
Here's the checklist for reviewing migrations:

General requirements

  • Satisfies idempotency requirement (both up() and down())
  • Does not reference models
  • Filename is in the correct format (and correctly ordered)
  • Targets the next minor version
  • All code paths have appropriate log messages
  • Uses the correct utils
  • Contains a minimal changeset
  • Does not mix DDL/DML operations
  • Tested in MySQL and SQLite

Schema changes

  • Both schema change and related migration have been implemented
  • For index changes: has been performance tested for large tables
  • For new tables/columns: fields use the appropriate predefined field lengths
  • For new tables/columns: field names follow the appropriate conventions
  • Does not drop a non-alpha table outside of a major version

Data changes

  • Mass updates/inserts are batched appropriately
  • Does not loop over large tables/datasets
  • Defends against missing or invalid data
  • For settings updates: follows the appropriate guidelines
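The idempotency item in the checklist can be illustrated with a deliberately simplified, in-memory sketch (this is not Ghost's migration utils; it only shows the check-before-act pattern that makes `up()` and `down()` safe to run more than once):

```javascript
// Simplified in-memory model of an idempotent column migration.
// A real migration would use the schema builder / Ghost's migration utils.
function up(table) {
    // Only add the column if it isn't already there, so re-running is safe.
    if (!Object.prototype.hasOwnProperty.call(table.columns, 'metadata')) {
        table.columns.metadata = {type: 'text', nullable: true};
    }
}

function down(table) {
    // Deleting a missing key is a no-op, so down() is also safe to repeat.
    delete table.columns.metadata;
}

const jobs = {columns: {created_at: {type: 'dateTime', nullable: false}}};
up(jobs);
up(jobs); // second run changes nothing
console.log('metadata' in jobs.columns); // true
down(jobs);
down(jobs); // second run changes nothing
console.log('metadata' in jobs.columns); // false
```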

@github-actions github-actions bot added the migration [pull request] Includes migration for review label Sep 12, 2024
```diff
@@ -985,7 +985,9 @@ module.exports = {
     started_at: {type: 'dateTime', nullable: true},
     finished_at: {type: 'dateTime', nullable: true},
     created_at: {type: 'dateTime', nullable: false},
-    updated_at: {type: 'dateTime', nullable: true}
+    updated_at: {type: 'dateTime', nullable: true},
+    metadata: {type: 'text', maxlength: 1000000000, fieldtype: 'long', nullable: true},
```
Contributor

Do you have a sense of how big we'd expect this field to actually be? I wonder if allowing a maxlength this large is potentially encouraging misuse of this field?

Contributor Author

I originally had it like other text fields at 2000 chars. I think that ought to be sufficient, but this is why I was holding off on the migration. We could start with 2000 and go from there?

Contributor

Yeah 2000 seems more sane to me — in most cases this should basically be a resource ID and maybe a couple small pieces of metadata (e.g. a timestamp or two, maybe a url pointing to a file or storage bucket, etc.)

Contributor Author

Cool, I'll bump back. Basically, since this is a 'JSON' field I had copied what we had elsewhere. Agree it seems to be overkill and it's easier to bump up than down.

Contributor Author

Adjusted. If I mark this resolved, does it also resolve it for you? I forget how GH handles that.
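To put the 2000-character limit in perspective, here is a sketch of the kind of payload described above (a resource ID plus a couple of timestamps; all field names and values are illustrative, not taken from the PR):

```javascript
// Illustrative metadata payload: a resource ID and a couple of timestamps.
const metadata = {
    jobName: 'email-analytics-latest-opened',
    memberId: '63f1aee1b2c3d4e5f6a7b8c9',
    startDate: '2024-09-01T00:00:00.000Z',
    endDate: '2024-09-12T00:00:00.000Z'
};

const serialized = JSON.stringify(metadata);
console.log(serialized.length); // comfortably under 2000 characters
```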

```diff
@@ -42,7 +42,7 @@ const initTestMode = () => {
     }, 5000);
 };
 
-const jobManager = new JobManager({errorHandler, workerMessageHandler, JobModel: models.Job, domainEvents});
+const jobManager = new JobManager({errorHandler, workerMessageHandler, JobModel: models.Job, domainEvents, isDuplicate: true});
```
Contributor

Not a blocker, but just a note for the future: it would be good to get to the bottom of why we need two instances of the job system, and ideally fix that so we don't need this anymore.

Contributor Author

Agreed! I've got a note to look into this, because it's a rather clunky way of handling it that I'm also really not a fan of.
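One possible direction for removing the `isDuplicate` flag entirely is a shared accessor, so both call sites receive the same instance. This is a sketch under assumed names, not a committed design:

```javascript
// Sketch: lazily construct a single JobManager and hand the same instance to
// every caller, instead of constructing a second one with isDuplicate: true.
function makeSharedInstance(factory) {
    let instance = null;
    return function get() {
        if (!instance) {
            instance = factory();
        }
        return instance;
    };
}

// The factory runs once; every "call site" receives the same object.
const getJobManager = makeSharedInstance(() => ({name: 'jobManager'}));
console.log(getJobManager() === getJobManager()); // true
```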

```js
 *
 * @returns {Promise<void>}
 */
async startQueueProcessor() {
```
Contributor

Just for the sake of readability and testability, I think it might be good to break this bad boy up a bit; it's a little hard for me to follow in its current form.

Contributor Author

I did a pretty significant refactor of that entire class to break it up a lot and expose more, so testing is easier.

This is potentially slightly less efficient, because we do a select before inserting instead of ignoring the insert conflict. In practice I don't anticipate a ton of duplicates, although the analytics job can certainly produce them, which is where I'd expect the knex implementation to slightly win out.

Regardless, testing is a nightmare with knex directly, as we have to spin up a db and use a schema for the table. Let's go down that path later if we need the performance improvements.
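The trade-off described above (select-before-insert versus letting the database ignore the conflict) can be sketched in memory. Names here are hypothetical; the real code goes through the Job model:

```javascript
// In-memory sketch of select-before-insert deduplication. The alternative
// discussed is a knex insert with the conflict ignored, which saves one
// round-trip per job but is harder to test without a real database.
class InMemoryJobStore {
    constructor() {
        this.jobs = new Map();
    }

    async addOnce(name, data) {
        if (this.jobs.has(name)) { // the "select": is this a duplicate?
            return false;
        }
        this.jobs.set(name, data); // the "insert"
        return true;
    }
}

const store = new InMemoryJobStore();
store.addOnce('email-analytics', {batch: 1}).then((first) => {
    console.log(first); // true: inserted
    return store.addOnce('email-analytics', {batch: 1});
}).then((second) => {
    console.log(second); // false: duplicate skipped
});
```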
@9larsons
Contributor Author

Need to add a couple of integration tests, then we ought to be OK. Not sure what the Ghost-CLI failure is.

@vikaspotluri123
Member

It looks like there's an error that's preventing the persisted logs from being printed 🙃 Here's the culprit:

Ghost was able to start, but errored during boot with: Cannot find module 'workerpool'

Looking at the lockfile changes, I'm not sure if they're intentional - there are a lot of new dependencies, which I think might stem from an older version of @tryghost/errors.
Also, the real issue is that it looks like workerpool wasn't added as a dependency?
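The fix implied here is to declare the dependency explicitly in the relevant package's `package.json` (the version range below is illustrative, not taken from the PR):

```json
{
  "dependencies": {
    "workerpool": "^6.5.0"
  }
}
```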

@9larsons
Contributor Author

OK, clearly there's some kind of race condition between the config mocks and initializing the service. I'll have to look into that when I can reprioritize this.
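One common way to sidestep this class of race is to read config lazily at first use rather than at module load, so test mocks are installed before the service ever touches config. A sketch with assumed names (`getConfig` is a hypothetical injectable accessor, not Ghost's actual API):

```javascript
// Sketch: defer reading config until init() is called, so a test can swap in
// its config stub first. getConfig is an assumed injectable accessor.
function createQueueService(getConfig) {
    let state = null;
    return {
        init() {
            if (!state) {
                state = {enabled: Boolean(getConfig('services:jobs:queue:enabled'))};
            }
            return state;
        }
    };
}

// In a test, the stub is in place before init() ever reads it.
const stubConfig = (key) => (key === 'services:jobs:queue:enabled' ? true : undefined);
const service = createQueueService(stubConfig);
console.log(service.init().enabled); // true
```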
