Skip to content
This repository has been archived by the owner on Dec 15, 2022. It is now read-only.

Fix race condition when sending daily stats #25

Closed
wants to merge 4 commits into from

Conversation

annthurium
Copy link

@annthurium annthurium commented Apr 1, 2019

Problem

@rafeca (and @shana) pointed out that there is a race condition with reporting stats. If we have multiple client windows open, we could end up sending the same stats twice. The db is shared across all client instances, and the database is cleared asynchronously after waiting for the request to finish.

Solution

The fix is to add an isReporting class attribute, which we set to true at the beginning of the reportStats method and re-set to false at the end. Since the StatsStore class is a singleton, and nodejs event loop is single threaded*, hopefully this will prevent multiple running windows from trying to send data at the same time.

Alternate approaches

I also considered making each window have its own db instance. Shana also pointed out that there might be race conditions on writes in #21. However, I did a little research and it looks like lokijs throttles automatic saves when these would cause overlaps with a save currently in progress. This prevents data loss at the cost of less frequent saves. So I think we're ok here, but I'm open to changing this if you disagree.

@annthurium annthurium requested a review from rafeca April 1, 2019 22:56
@rafeca
Copy link
Contributor

rafeca commented Apr 2, 2019

Wow this was fast! Thanks for working on this!! 🎉

The fix is to add an isReporting class attribute, which we set to true at the beginning of the reportStats method and re-set to false at the end. Since the StatsStore class is a singleton, and nodejs event loop is single threaded*, hopefully this will prevent multiple running windows from trying to send data at the same time.

I'm pretty sure that each electron window runs in a separate renderer process, in a similar way that chrome does for each tab, so I don't think that this solution would be effective.

Instead, we can use the localStorage to create a simple mutex, since it is shared between the different renderer processes. That solution would be very similar than this PR, but instead of keeping the isReporting value in memory it should be read/written from localStorage (note that localStorage is not guaranteed to be thread-safe, but in this case this should not be an issue).

@rafeca
Copy link
Contributor

rafeca commented Apr 2, 2019

After looking at this I just got curious about lokijs and how it implemented the storage of events and from what I saw by default lokijs is an in-memory database and does not persist the data by default (so each window currently has its own list of events).

This means that race conditions don't really exist at the moment: each window currently keeps its list of events but they "conflict" when checking the last-daily-stats-report data from localStorage.

But if I'm not mistaken, this also means that we're currently losing all the events that haven't been sent once the Atom window gets closed, and since we're only sending events once Atom has been opened for 4h, we're losing any event from sessions shorter than 4h.

Using a persistance adapter on LokiJS (as mentioned in the docs) would fix this specific problem, but then we would introduce the concurrency issues when having multiple windows open, since lokijs only syncs the data from the persistance adapter on load:

An important distinction between an in-memory database like lokijs and traditional database systems is that all documents/records are kept in memory and are not loaded as needed. Persistence is therefore only for saving and restoring the state of this in-memory database.

So I see two potential paths forward (there may be more options):

  • Use a global db instance for all the windows but move away from lokijs and use a system that ensures that multiple concurrent clients are handled correctly (even by accessing IndexedDB directly). This could cause some performance regressions if we have many events, since for every read/write we would need to access the storage system (and serialize/deserialize the data).
  • Keep using lokijs as an in-memory DB (so each session has its own instance) but persist the data that has not been sent when Atom is closed. Then re-read the unsent data (and delete it from the persistant store immediately) the next time an Atom window gets opened.

@annthurium
Copy link
Author

lokijs is an in-memory database and does not persist the data by default (so each window currently has its own list of events).

🤦‍♂️I did not even realize that lokijs is in memory and doesn't persist. Thanks for bringing that up! A lesson to me to investigate tools more deeply.

Use a global db instance for all the windows but move away from lokijs and use a system that ensures that multiple concurrent clients are handled correctly (even by accessing IndexedDB directly). This could cause some performance regressions if we have many events, since for every read/write we would need to access the storage system (and serialize/deserialize the data).

This seems simpler to implement, since I can imagine more concurrency bugs that might shake out of the other approach, if we have multiple windows that are writing to disk when they close, and also trying to send stats.

Since I don't love the plain indexdb api, I evaluated two other indexdb wrappers.

pouchDB

pros

  • used in other Electron apps
  • nice api for async transactions

cons

  • the docs don't mention explicit typescript support
  • has a lot of features we don't necessarily need (like syncing to a couchdb instance) so might be overpowered for our purposes
  • the docs tell you to bower install to get started. what year is this, 2014? 😆 do these folks stan bower or are their docs just really out of date?
  • bigger bundle size than dexie (113.6 kB unzipped) but tbh probably not big enough to matter

dexie

https://dexie.org/docs/API-Reference#quick-reference

pros

cons

  • ???

Dexie is the clear winner here, so I'll start implementing a move from lokijs to dexie.

@rafeca please let me know if I'm missing anything, or if you have any concerns about this approach. Thanks for your help!

@rafeca
Copy link
Contributor

rafeca commented Apr 23, 2019

Hey @annthurium, sorry for the delayed response, I was on PTO until today 🏖

Thanks for investigating potential solutions and writing down the differences! I agree with your outcome and dexie looks like a good solution (alternatively we could use directly the IndexedDB API to save the 55KB since we don't need complex queries logic, but that API is a nightmare to work with 😅, and as you mention 55KB is not a big deal for Atom).

@rafeca
Copy link
Contributor

rafeca commented Jun 3, 2019

Closing this PR in favor of #29

@rafeca rafeca closed this Jun 3, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants