EPIC - EEMMSS #1027

Open
3 of 12 tasks
Tracked by #1025
zanete opened this issue Sep 30, 2024 · 5 comments
zanete commented Sep 30, 2024

Why: Sub of #1025
What: A multi-model environmental impact assessment of an end-to-end web/cloud application.

Problem Statement

There are many ways to calculate the environmental impacts of software these days: many tools (some closed source, some open source) and many methodologies and choices.

In the midst of all those options is the Impact Framework (IF). We are often seen as a direct competitor to solutions that we are in fact complementary to.

We are a framework that allows you to combine multiple methodologies to compute an end-to-end environmental impact for your software. We also provide a common way to communicate, aggregate, and visualize impacts.

The goal of this epic is to demonstrate that IF is a tool that:

  • Glues many other tools and methodologies together to provide an end-to-end figure for your application's impact.
  • Surfaces the differences between various methodologies and solutions so we can understand the trade-offs of one versus the other.
  • Can generate a static impact value and can also be run dynamically to generate more "real-time" (less-delayed) impacts.

Proposed Solution

  • We will create one web application and run it under a simulated load capturing a varied set of observations.
  • We will compute the impact of that software application using different pipelines to demonstrate both how the IF is a glue that allows you to stitch together an end-to-end report AND how the different pipelines result in different values.

The application needs to be one with these components:

  • Web Servers: Running on bare metal with Kubernetes. The setup must support autoscaling.
  • Managed Services: Some DB, Some storage (e.g. S3), Some serverless functions
  • Networking: Obviously some network transfer from the server to the end user.
  • Browser: The website loads in a web browser.

The behaviors we want to trigger are:

  • Simulate a varied user load on the application for at least a few hours.
  • The varied load should trigger new servers to be spun up in the background and then spun back down. This dynamic change in the number of servers is a common and complex use case to measure and track, so we want to simulate it.
  • There is some data saved and retrieved from storage (an image perhaps).
  • There is some browser usage (we can model this last part; there is no need to spin up multiple browsers).
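
As a rough illustration of the "varied load" behavior above, the sketch below generates a target request rate for each minute of a multi-hour run. The function names, the raised-cosine shape, and all default values (`base_rps`, `peak_rps`, `period_minutes`, the 180-minute duration) are assumptions made for this example, not decisions from this epic. A real run would feed a schedule like this into whichever load-testing tool is chosen.

```python
import math


def request_rate(minute: int, base_rps: float = 20.0, peak_rps: float = 200.0,
                 period_minutes: int = 60) -> float:
    """Target requests per second at a given minute of the test (hypothetical helper)."""
    # A raised cosine moves smoothly from base_rps up to peak_rps and back
    # once per period, which should push an autoscaler up and then down again.
    phase = 2 * math.pi * (minute % period_minutes) / period_minutes
    return base_rps + (peak_rps - base_rps) * (1 - math.cos(phase)) / 2


def load_schedule(duration_minutes: int = 180) -> list[float]:
    """One target rate per minute for the whole run (default: three hours)."""
    return [request_rate(m) for m in range(duration_minutes)]


if __name__ == "__main__":
    schedule = load_schedule()
    print(f"minutes: {len(schedule)}, min: {min(schedule):.1f} rps, max: {max(schedule):.1f} rps")
```

A repeating wave is only one option; adding random bursts on top would exercise scale-up and scale-down less predictably.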

NOTE
We should not try to create our own application from scratch here. The internet is large, and there are many example reference architectures and many public Docker containers out there. I expect us to use something well known and public, something anyone can spin up and figure out for themselves.
We need to use Kubernetes so we can use Kepler as one of the energy measurement tools, since it only works with Kubernetes.
We will have to use a cloud provider, but let's try to make it as agnostic as possible.

The pipelines we'd like to explore:
Energy
Depending on the approach, we can try several and swap them in and out.
Look at the awesome green software list and the recent list from GitHub to make sure we've got a large selection.

  • Kepler (& -> Energy)
  • Scaphandre (& -> Energy)
  • TeadsCurve (CPU Util -> Energy)
  • Boavizta (CPU Util -> Energy)
  • mlCO2 (CPU Util -> Energy)
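
To make the "CPU Util -> Energy" style of pipeline concrete, here is a minimal sketch that maps CPU utilization to a fraction of the processor's TDP by linear interpolation and then converts that to energy. The curve points are the values commonly quoted for the Teads curve; treat them, the choice of linear interpolation, and the function names as assumptions to verify against the published curve rather than as the implementation used by any IF plugin.

```python
from bisect import bisect_right

# Assumed curve points: (CPU utilization in %, fraction of TDP drawn).
# Check these against the published curve before relying on them.
CURVE = [(0.0, 0.12), (10.0, 0.32), (50.0, 0.75), (100.0, 1.02)]


def tdp_factor(cpu_util: float) -> float:
    """Linearly interpolate the TDP fraction for a utilization in [0, 100]."""
    if not 0.0 <= cpu_util <= 100.0:
        raise ValueError("cpu_util must be between 0 and 100")
    xs = [x for x, _ in CURVE]
    # Index of the right-hand point of the segment containing cpu_util.
    i = min(bisect_right(xs, cpu_util), len(CURVE) - 1)
    (x0, y0), (x1, y1) = CURVE[i - 1], CURVE[i]
    return y0 + (y1 - y0) * (cpu_util - x0) / (x1 - x0)


def energy_kwh(cpu_util: float, tdp_watts: float, duration_seconds: float) -> float:
    """Estimated CPU energy in kWh for one observation."""
    watts = tdp_factor(cpu_util) * tdp_watts
    return watts * duration_seconds / 3600 / 1000


if __name__ == "__main__":
    # Hypothetical observation: 50% utilization on a 100 W TDP part for one hour.
    print(energy_kwh(cpu_util=50, tdp_watts=100, duration_seconds=3600))
```

Tools such as Kepler and Scaphandre report energy directly, so an estimate like this one is what they would be compared against.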

Carbon Intensity

  • Static Yearly Average
  • Electricity Maps
  • WattTime
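
The practical difference between a static yearly average and a time-varying source can be sketched as follows. Every number here is made up for illustration; neither the energy values nor the intensities come from Electricity Maps, WattTime, or any real grid. The function name is likewise hypothetical.

```python
def operational_carbon(energy_kwh: list[float], intensity_g_per_kwh: list[float]) -> float:
    """Sum of energy x grid carbon intensity over matching time steps, in gCO2e."""
    if len(energy_kwh) != len(intensity_g_per_kwh):
        raise ValueError("energy and intensity series must have the same length")
    return sum(e * i for e, i in zip(energy_kwh, intensity_g_per_kwh))


if __name__ == "__main__":
    # Made-up hourly energy use (kWh) and hourly grid intensity (gCO2e/kWh).
    hourly_energy = [0.05, 0.20, 0.10]
    hourly_intensity = [300.0, 450.0, 150.0]
    # Made-up yearly average for the same grid.
    static_average = 300.0

    dynamic = operational_carbon(hourly_energy, hourly_intensity)
    static = operational_carbon(hourly_energy, [static_average] * len(hourly_energy))
    print(f"time-varying: {dynamic:.1f} g, static average: {static:.1f} g")
```

With these invented inputs the two approaches disagree because most of the energy is used in the hour with the highest intensity, which is the kind of difference this epic wants to surface.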

Embodied
Very few options :/

  • Boavizta?
  • TeadsEmbodied (Need to build) - instance-type / cpu-name (@asim TBD)
  • ManualCoefficient
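
Whichever source supplies the total embodied figure, it usually has to be apportioned to the time and share of the hardware actually used. A minimal sketch, assuming the apportioning formula from the SCI specification, M = TE × (TiR / EL) × (RR / ToR); the function name and the example numbers are hypothetical.

```python
def embodied_carbon(total_embodied_g: float, time_reserved_s: float,
                    expected_lifespan_s: float, resources_reserved: float,
                    total_resources: float) -> float:
    """Share of a device's embodied emissions attributable to one workload (gCO2e)."""
    if expected_lifespan_s <= 0 or total_resources <= 0:
        raise ValueError("lifespan and total resources must be positive")
    time_share = time_reserved_s / expected_lifespan_s       # TiR / EL
    resource_share = resources_reserved / total_resources    # RR / ToR
    return total_embodied_g * time_share * resource_share    # TE x shares


if __name__ == "__main__":
    # Hypothetical server: 1,000 kgCO2e embodied, 4-year lifespan, 8 vCPUs.
    # The workload reserves 2 vCPUs for one hour.
    four_years_s = 4 * 365 * 24 * 3600
    print(embodied_carbon(1_000_000, 3600, four_years_s, 2, 8))
```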

Other

  • Storage (Hackathon solution from ScottLogic)
  • CO2.js
  • Cost -> Carbon (CCF) - For the managed services where the only observation we get is perhaps cost.

Tasks

MVP Infrastructure

Community Engagement process

  • TBC

Measurement and Modelling

  • TBC

Future

@zanete zanete added this to the IF 1.0 milestone Sep 30, 2024
@zanete zanete self-assigned this Sep 30, 2024
@zanete zanete added the EPIC Used to denote an issue that represents a whole epic. Core team only label Sep 30, 2024
Contributor

jawache commented Oct 3, 2024

FYI, I believe (needs to be checked) that Scaphandre only runs on bare-metal servers and not VMs, which is why I suggest bare metal. Kepler needs to be run with Kubernetes. Other solutions just need the CPU util, which is fairly simple to get.

So bare-metal servers running Kubernetes will, I think, cover us for the majority of energy-gathering methodologies.

Author

zanete commented Oct 4, 2024

@jmcook1186 please review and add any thoughts before passing on to Narek

@zanete zanete assigned jmcook1186 and narekhovhannisyan and unassigned zanete Oct 4, 2024
Author

zanete commented Oct 9, 2024

Status update 9th Oct 2024:

  • had a kick-off call with @narekhovhannisyan and @jmcook1186
  • discussed the phases of this project and established that
    • Phase 1: Set up the application
    • Phase 2: Set up the measurement tools (sensors)
    • Phase 3: Capture data
  • Focused the meeting on figuring out Phase 1: the app
  • Listed the components the app should have versus the specific components those would translate to in the app
  • Made a decision to go with Azure as the platform as they provide the most information, and we hope we can leverage the members and contributors in case we get stuck
  • The majority of the remaining components will depend on the app
  • In order to make a decision on the app and infra, it should be understood what kind of technical setup is needed for the sensors.

Next steps:


Discussion board here: https://www.figma.com/board/8wVYEmQRgPEB8nGChil03o/IF-DB-Infra-Structure-Setup?node-id=0-1&t=U7bb7xhHVgHJKSIx-1

Author

zanete commented Oct 17, 2024

Status update 17th Oct 2024:

  • Weekly call with @jawache @jmcook1186 and @narekhovhannisyan
  • Reviewed research done thus far
  • Agreed to proceed with a locally runnable bare-bones MVP (no db or storage, no auto-scaling, to be added in later iterations)
  • Set up just a couple of the sensors to establish a process and understand what's involved and how this could be run
  • Renamed the initiative to EEMMSS, to have its own repo and try to involve the community in the ongoing iterations

Next steps:

Author

zanete commented Oct 23, 2024

  • Configured Kubernetes locally. The simple app is a Node.js server that returns the time; it has been configured and is running on the cluster
  • Currently connecting Kepler to the setup; will post in Slack when a response from Kepler is returned
  • Perhaps not working on macOS
  • Can't make decisions about how to involve the community until we have some numbers from a real system

Projects
Status: In Progress
4 participants