:PROPERTIES:
:ID: 72772426-cd53-4f61-b584-7807d274c0ad
:END:
#+title: IROH Team Meeting Notes
#+Author: Yann Esposito
#+Date: [2024-01-11]
- tags ::
- source ::
* [2024-01-11 Thu] Thursday only 30 min
** Intro
This will be a short meeting because I have so many new ones.
So first, happy new year everyone; I hope you enjoyed your time off.
About Guillaume, he is very stressed not to be with us.
So a few things to decide.
1. Is this time ok for the Tuesday? I mean the next hour. I cannot make it later unfortunately.
2. As I would like to reduce my amount of stressful communication, I would like
to keep an up to date version of:
- topic, status, people
What is a topic?
1. PO-driven topics: the official tasks we see during our Q3 commit.
This covers design, development, meetings, configuration, QA fixes,
being present during the related releases, admin tasks, helping QA, and answering
questions in the different chat rooms or in DM.
2. Unexpected topics:
- discovering a major issue that needs our attention ASAP.
- a new unexpected task asked by someone, perhaps an urgent one
- if asked by a PM or someone in another team, do not start working
unless you are confident this would not impact any delivery prediction.
If you are not comfortable with the ask, please send it to me.
- if asked by Jyoti, work on it, but let me and the PO know, in particular
if this affects other tasks.
** Weekly Meeting Organization
To lose as little time as possible, please put a quick
recap of your previous week in the chat, ideally 1h before the meeting.
Something quick with the following format:
- DONE (finished last week)
- DOING
- BLOCKED help needed
- TOPIC about a topic you would like to talk about during the meeting
Ideally, we should only talk about the "need help/blocked/ask for discussion" points.
That way I expect to be able to focus on the top-to-bottom news at the start of
the meeting, then we will try to talk about the most important topic.
If nobody proposes a topic, I will probably propose one myself and we might
discuss it.
We will probably try many different formats until we find something that is fine
for most of us.
* [2024-01-16 Tue] 30min
** Statuses
*** Ambrose
- DONE
- merged bad compojure-api usage (:return => :responses)
- DOING
- Subscription to asset scores via DI is failing with 401 response https://github.com/advthreat/iroh/pull/8699
- thanks to Mario for giving me the heads up
- experimenting with reitit for CTIA
- big task is to make equivalent to compojure.api.api/api in reitit with equivalent middleware
- some of the middleware uses implementation details of compojure-api like clj-momo.ring.middleware.metrics/wrap-metrics
- TOPIC
- shopping around for my next task to do after incident rescoring, suggestions welcome
- hearing rumors of “data lakes” that might replace ES/CTIA, ideally hop onto that bandwagon if it exists
*** Wanderson
- DONE
- merged check for QA urls in universal provisioning process to not send Okta JWT to QA invalid origin
- DOING
- short-term solution for brown field provisioning
- fighting emacs: perhaps my last upgrade was 1yr ago or so… I did a doom upgrade and things went badly. fixing it
- TOPIC
- tips on how to make your kid go back to school after 30 days at home. every day is a shitshow at the door.
*** GE
- DONE PCTIA dashboard in EU and APJC
- DOING
- created and modified in CTIM https://github.com/threatgrid/ctim/pull/439
- do not hide created and modified in CTIA
- ON HOLD:
- summarize incident at bundle import
- TOPIC:
- CTIA / ES performance issues seem mostly related to undersized IOPS that could not support the read rate during spike of bundle import.
*** Olivier
- DONE (to be merged!)
- cleanup of iroh TK config files in iroh repo
- refactoring of tenzin-config config files (bootstrap.cfg and config.edn) to reduce duplication
- added new config files per application (node types) for all envs in tenzin-config
- DOING
- working on defining the standard 'iroh' node type (to generate bootstrap file)
*** Matt
- DOING
- Capacity planning for Q3
- Meetings to prepare new features (Notifications, Mitre coverage pattern)
*** Kirill
- DONE
- fix kafka-connector --> ES data stream misconfiguration on TEST
- refactoring for both KafkaConnectService and DataStreamsService to be more generic with more declarative configuration
- DOING
- ElasticSearchSource Connector to extract data from Elasticsearch and stream it to a Kafka topic. Will most likely go ON HOLD
- Experiment with Graph databases
- data pipeline server for data ingestion into permanent graph DB
- explore capabilities of graph databases to perform fast and much more intelligent queries
- authorisation embedded into database model (fetch only the documents user is authorised to see)
- derived facts with semantic reasoning feels like AI without actual AI :)) check this video
- TOPIC
- Elasticsearch is causing more trouble than the feature set we actually use from it justifies.
*** Shafiq
- DOING
- Fallback store for iroh-events
- iroh-proxy health check for slack
*** Mario
- DONE
- Split risk scoring as a task out of incident enrichment task (for release this week)
- Added max execution time limit to incident summary task (for release this week)
- Updated connection manager config in response to incident summary failures
- DOING
- Reviewing execution failures during risk scoring, enrichment, and incident summary in PROD
- Sync incident_time during incident-summary updates
*** Yann
- DONE
- (waiting for review) Track Impersonators
- DI clients update (added private-intel scope)
- DOING
- Check Quarter Topics
- Q3 Team Capacity
- [Brownfield] Attach existing SX/XDR to an existing SCC account (PIAM)
*** Patrick
- DOING
- Monitoring
** Meeting Topic points
- 3 ES-related topics
- 1 personal-life topic (kids)
** Kirill ES
Asking around: which features are we using from ES?
The ability to index irregular fields?
Exploring GraphDB: promising, with the ability to join and connect documents.
We would save a lot of HTTP requests, and probably gain a lot of improvements for our
current usage.
Also, we have a lot of data stored in a denormalized way, not properly linked.
Summaries: it would be great not to save summaries, but to run a query instead.
Drop-in replacement, using the store service.
** Mario
Performance benefits from ES.
- @Kirill: tried RAM Graph DB, should probably work. It will shape of IROH.
- @Jerome: take care of the backups, etc.; if it works correctly in PROD,
ops will not maintain it.
- @Patrick: we could use a SaaS MongoDB platform. Not cheaper but easier, with many
more IOPS.
- @Kirill: we would probably need fewer IOPS if we could use another DB.
Retention: 4 years.
- @Jerome: cold storage? warm storage?
- @Jerome: can you name a production-ready GraphDB?
- @Kirill: Neo4j, Neptune, ...
- @Jerome: >100 indexes in NAM
** Topics
- get rid of data (use tenant and SX EOL)
* [2024-01-30 Tue] 30min
** Statuses
*** Kirill
- DOING
- Design for Notification preferences and delivery, together with gluing the Notification object to NotificationRequest as a foundation for multi-target delivery (one notification to email, IM and InApp)
*** Matt
- DONE
- Upgraded JAMF Classic API authentication (basic auth -> token auth)
*** Olivier
- DOING
- MITRE ATT&CK Coverage Mapping: design of the import of Talos MITRE coverage files
*** Wanderson
- DONE
- Brownfield provisioning TAC API
- Support for FMC JWT in IROH
- DOING
- FMC Proxy for OAuth2 and SSE requests
*** GE
- DONE
- managing SE attack on iroh async
- stats for PM: https://github.com/advthreat/iroh/issues/8853
- DOING
- MITRE mapping design
*** Mario
- DONE
- Session log maintenance PR to address long-running sessions consuming Redis memory in iroh-async
- DOING
- Queue inspection/management tools for iroh-admin
*** Yann Esposito
- DONE JWT middleware to support JWT without nbf claim
- DONE Easy impersonate for TAC
- DONE Fix PIAM endpoints
- DONE Attach Tenant for Superball (P1)
- DOING following incident promotion issue; false positive from Talos + SE events
- DOING Q3 workload preparation
- DOING Help:
- Meraki Integration (lots of OAuth2 related questions)
- Automation to use two clients.
- ES cleanup
- Discuss Impersonation use cases for Efficiency team (Petr)
- Discuss Impersonation risks with Chris Duane
- Discuss Impersonation for TAC Portal
- Ihor about expectation of legacy provisioning
- Follow Universal Provisioning testing
*** Shafiq
- DONE
- Fallback store for iroh-events
- iroh-proxy health check for slack
- DOING
- iroh-proxy authentication for Checkpoint API
*** Patrick
- DOING: ddog pg monitoring: manual tests OK, now I am working on integration in tenzin's salt and tf
*** Ambrose
- DONE
- redesigned incident asset rescoring pipeline to be simpler https://github.com/advthreat/iroh/issues/8824
- DOING
- implementing it https://github.com/advthreat/iroh/pull/8843
- continuously gathering requirements and tweaking the design
*** Jérôme
- DOING
- MSK migration on auth cluster (testing iroh conf)
- improving alerts
- DONE
- add some alerts on DD
** Topics
*** Plan to prevent future incidents from filling the queue?
- can we support more than one event concurrently?
- where should we invest our time?