2017-03-20 RackHD Architecture Forum Meeting notes

Date

Attendees

Proposed Agenda

  • Review AIs
    • Maithri - add footprint data for microservices (AI)
  • Questions regarding Dell Server Team interlock
    • Maithri - indicated discussions with Brian Doty re: Power Monitoring, RBAC and VLAN management use cases (AI)
  • Collateral status and review - next steps
    • Team needs to review materials and provide feedback to authors and rest of team (in Confluence)
    • Needs to be inclusive of Neighborhood Manager and InfraEnablers, so long as the talk track (textual info that goes with diagrams) explains that some of this technology consists of concepts still being developed and extensions to the original RackHD architecture
  • Discussion - WFEaaS - standalone workflow engine
    • There is an ongoing effort to remove the dependence on on-http
      • WIP - developing a native HTTP interface for on-taskgraph based on a subset of RackHD 2.0 APIs
      • Subset includes those used to trigger workflows
      • Also working on adding support for gRPC to allow for RPC semantics to drive workflows, when HTTP is not needed or required
    • Dependence on MongoDB
      • Will be very challenging
      • Today, an abstraction called a "store" exists which is based on using MongoDB as a persistence backend
      • Could potentially develop another "store" which leverages a local FS
      • Problem – "store" is based on MongoDB specific semantics – semantics are not abstracted
      • Almost all WFE state is stored in MongoDB with the exception of poller state, which is kept in memory
      • WFE is able to recover any workflow if processes/containers were to crash
    • Native HTTP and gRPC changes are currently sitting in a feature branch/fork ("modularity")
      • PR exists, but realized test coverage was not good enough, so working to improve test coverage
    • Lexington (an OVA) will likely want to consume microservices as JARs, not sure about RackHD (will discuss with server team on 3/31/17)
    • DHCP/TFTP
      • Today, tightly coupled to certain workflows (e.g. discovery, inventory, cataloging, OS installs) that are based on PXE or iPXE
      • If the workflows being run do not depend on PXE/iPXE and also do not depend on having RackHD assign IPs, then DHCP/TFTP can be eliminated from a WFE deployment
    • Dell PE servers with iDRAC
      • Workflows can leverage iDRAC virtual media
      • Current OS deployment microservice mounts virtual media (some sort of NFS/CIFS share with install media)
      • RackHD would be present to coordinate workflows
        • Take user prompts, create ISOs on virtual media, take results from steps and feed them into subsequent microservice RESTful calls, etc.
      • If Lexington is already able to perform coordination, it is possible they may only want to consume the microservice and not RackHD – will discuss on 3/31
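
Illustration for the "Dependence on MongoDB" item above: a minimal sketch of what a backend-neutral "store" with a local-FS implementation might look like. This is an assumption-laden example, not RackHD's actual store API; the names WorkflowStore, LocalFSStore, save_workflow, load_workflow and list_unfinished, and the JSON state layout, are all hypothetical. It only shows the idea that workflow state written through an abstract interface can be picked up again after a process/container restart.

```python
import json
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Dict, List


class WorkflowStore(ABC):
    """Hypothetical backend-neutral interface (not RackHD's real store API)."""

    @abstractmethod
    def save_workflow(self, workflow_id: str, state: Dict) -> None: ...

    @abstractmethod
    def load_workflow(self, workflow_id: str) -> Dict: ...

    @abstractmethod
    def list_unfinished(self) -> List[str]: ...


class LocalFSStore(WorkflowStore):
    """Hypothetical store that keeps one JSON file per workflow on local disk."""

    def __init__(self, root: Path) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, workflow_id: str) -> Path:
        return self.root / f"{workflow_id}.json"

    def save_workflow(self, workflow_id: str, state: Dict) -> None:
        # Write to a temp file, then rename, so a crash never leaves a half-written record.
        final = self._path(workflow_id)
        tmp = final.with_suffix(".tmp")
        tmp.write_text(json.dumps(state))
        tmp.replace(final)

    def load_workflow(self, workflow_id: str) -> Dict:
        return json.loads(self._path(workflow_id).read_text())

    def list_unfinished(self) -> List[str]:
        # Assumed convention: a workflow is done when its "status" field is "finished".
        unfinished = []
        for path in sorted(self.root.glob("*.json")):
            if json.loads(path.read_text()).get("status") != "finished":
                unfinished.append(path.stem)
        return unfinished


def demo() -> List[str]:
    """Persist two workflows, simulate a restart, and list what needs recovery."""
    with tempfile.TemporaryDirectory() as directory:
        store = LocalFSStore(Path(directory))
        store.save_workflow("wf-1", {"status": "running", "next_task": "catalog"})
        store.save_workflow("wf-2", {"status": "finished"})
        # A fresh instance over the same directory stands in for a restarted process.
        recovered = LocalFSStore(Path(directory))
        return recovered.list_unfinished()
```

Calling demo() returns the IDs of workflows that a restarted engine would need to resume. Note that in-memory state (such as the poller state mentioned above) would not survive a restart under this sketch either.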

Action items

  • Former user (Deleted) - collect footprint metrics for microservices to have available for the next interlock with the Dell Server team; the Confluence site containing the information has been shared with the team
  • Former user (Deleted) - provide information re: power monitoring, RBAC and VLAN management use cases to Former user (Deleted) and Tom Capirchio in preparation for Tom's sync-up meeting with Dell Server PdMs on 3/28/17 - Maithri sent an email to Tom Cap, Byron and Joe
  • Former user (Deleted) - follow up with IT on problems with Arch Forum DL in both directions - Email has been sent to Tina and she will follow up with IT
  • Former user (Deleted) - add goals and success criteria to RI-132
  • Team - provide feedback to architecture collateral in the respective Confluence spaces