Core Committer Weekly Interlock - September 28th 2017

Attendees

Thomas Sullivan

Former user (Deleted)

Former user (Deleted)

Bishoy Youssef

Leo Zhang

Michael Hepfer

Former user (Deleted)

Former user (Deleted)

Ahmed Osama

Former user (Deleted)

Former user (Deleted)

James Turnquist

Agenda

  1. Architectural discussion: single entry point for services
    1. Looking for a single entry point to all of the microservices. Discussed briefly how the smi services leverage Zuul today. Should this also be considered for on-taskgraph and other future RackHD services?
    2. Currently the smi services also use the workflow engine as an entry point to those services. Looking to eliminate the smi/workflow engine dependency and leverage a standard API gateway.
      1. Zuul is leveraged for the smi services because it met the smi service requirements
      2. Recommend looking at other technologies (including nginx)
      3. Requirements: generate custom filters, load the config from a key-value store
      4. Generate a list of features for these technologies, including a POC (how it would work with smi services and node.js)
      5. AI: Amy Mullins to create the epic and initial spike story(ies). Assignment can be coordinated through the managers. Recommend going with a dev who knows the APIs very well (redfish, 2.0)
        1. Expectation is a report-out on the various technologies, including the POC.
        2. Priority on RackHD backlog? AI: Thomas Sullivan to help determine where RackHD epics/RACs are now prioritized.
  2. RackHD Release Cadence
    1. As we’re moving into continuous delivery for the Concourse-based CI (i.e., deployed packages per merged PR), does that change the need for, or frequency of, weekly RackHD sprint releases? Email thread started on 9/20.
      1. QRB discussion on 9/27 included having PR quality gates/Post Merge testing on virtual hardware and updating the weekly Sprint Release to run on physical hardware (Baremetal-Regression)
      2. Release process suggestions:
        1. Continuous deployment always overriding devel (debian), latest (docker)
        2. Dockerhub not updating RackHD files tagged devel
    2. If we are releasing debians and docker containers AND the demo is moved to a docker image, do we still need to provide a script in the new CI env that allows users to generate a Vagrant based RackHD image?
      1. 2 use cases:  Vagrant based demo, Vagrant based dev environment
        1. Vagrant based demo becomes obsolete with the docker-compose based demo
        2. Vagrant based dev env also becomes obsolete with the docker-compose based demo (or other means/variations)
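
For agenda item 1, the two stated gateway requirements (custom filters, config loaded from a key-value store) can be illustrated with a small sketch. This is not Zuul or nginx configuration and not a proposed design; it is a toy model, assuming a plain dict stands in for the key-value store and that routes are matched by path prefix. The `gateway/routes/` key prefix, the `*.example` upstream URLs, and the `require_token` filter are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# A request is modeled as a flat dict; a filter returns the (possibly
# modified) request, or None to reject it before routing.
Request = Dict[str, str]
Filter = Callable[[Request], Optional[Request]]


@dataclass
class Gateway:
    """Toy single entry point: a filter chain followed by prefix routing."""
    routes: Dict[str, str] = field(default_factory=dict)
    filters: List[Filter] = field(default_factory=list)

    def load_routes(self, kv_store: Dict[str, str],
                    prefix: str = "gateway/routes/") -> None:
        # Stand-in for reading a real key-value store: keys under the
        # (hypothetical) prefix map a public path to an upstream base URL.
        for key, upstream in kv_store.items():
            if key.startswith(prefix):
                self.routes["/" + key[len(prefix):]] = upstream

    def resolve(self, request: Request) -> Optional[str]:
        # Run every custom filter first; any filter may reject the request.
        for custom_filter in self.filters:
            result = custom_filter(request)
            if result is None:
                return None
            request = result
        path = request["path"]
        # Longest prefix first, so more specific routes win.
        for route in sorted(self.routes, key=len, reverse=True):
            if path == route or path.startswith(route + "/"):
                return self.routes[route] + path[len(route):]
        return None


def require_token(request: Request) -> Optional[Request]:
    # Hypothetical custom filter: reject requests that carry no token.
    return request if request.get("token") else None


# Hypothetical key-value store contents.
kv = {
    "gateway/routes/smi/inventory": "http://smi-inventory.example:8001",
    "gateway/routes/taskgraph": "http://taskgraph.example:8002",
    "unrelated/key": "ignored",
}
gw = Gateway(filters=[require_token])
gw.load_routes(kv)
print(gw.resolve({"path": "/taskgraph/api/2.0/tasks", "token": "t"}))
```

A POC along the lines requested in the action item would replace the dict with a real key-value store client and the `resolve` step with actual request forwarding; the sketch only shows how filters and store-driven routes fit together.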


Did not get to these topics today:

  1. Review slides from Former user (Deleted) for CI Security, moving CI to container, moving CI to cloud.
    1. AI: CC team to review the slides, come back with feedback/answers to the questions posed in the slide deck.
  2. RackHD Documentation 
    1. Per the Slack channel: 

      3lm0 [11:42 AM] 
      @maithri indeed, the blockers for now are the UI/UX for our end users which aren't technical one, and the consistency among the documentation leading to a difficult installation process with various pitfalls to be aware of plus the fact that as operator you have to deal with too many configuration and manual operations.


  3. On-web-ui long term plan
    1. AI: Amy Mullins to send email on the topic to get the discussion started


Moved to email discussion:

RackHD Tooling Updates

  1. Ubuntu to be upgraded to 16.04
    1. What has been developed to date for the Concourse env includes the 16.04 migration. Should the Jenkins based env be upgraded as well?
    2. OVA scripts will need to be updated (passing a parameter) to move to 16.04 (covered by Felouka: RAC-5987 - Create tooling that allows users to generate an OVA based RackHD image; status: IN PROGRESS)
  2. Node v6 is the current available version; RackHD is running v4.
    1. RackHD Epic to be created to migrate from v4 → v8
      1. Needs to be assigned.  Former user (Deleted)/Maglev team to help create the epic.
  3. RackHD Story tracking testing the latest MongoDB version in CI (Mongo recommends using 3.X+ versions only and does not support anything in the 2.X version family)
    1. Do we want to support this in Jenkins, Concourse, or both? Will be part of the Concourse env.
      1. Concourse env tracking story: RAC-5991 - Placeholder - Mongodb running the latest version in test (status: BACKLOG)
      2. On the Jenkins side, this seems to be a trivial effort to support. It may bring out issues in RackHD and show whether previous issues have since been resolved.
        1. Plan to try testing with the latest and see what the issues are. If the set of issues is trivial, move to the latest. If many issues are encountered, hold off.
        2. Maglev team to create the story and target next sprint. - any update? 

Recently closed out:

Ease of use of the Vagrant based demo

  1. Week of 8/7 comment on the slack channel: "At this point I've given up on rackhd. If even the demo requires an old version of ubuntu to run an old version of virtualbox to get it working, I will stick with something simpler."
    1. Requirements agreed to  by the team:   
      1. The demo should be simple, convenient, easy to use / bring up and debug
      2. The environments should be workable across different or latest versions of dependencies such as mongodb and docker/virtualbox
      3. Host OS independent: can run on a Windows, Linux, or MacOS host system.
      4. Has no impact on, or dependency on, the host network.
      5. utilizes existing nightly RackHD images, does not require building / testing of additional new images
      6. Uses infrasim for vBMC nodes
      7. can run discovery workflows
      8. can run OS install workflows
      9. can run FIT smoke test suite
      10. The demo solution could support running in cloud (ex: IaaS, PaaS) technically
      11. to include the smi microservice containers  
    2. CC team voted for the "docker-compose" POC effort: https://github.com/RackHD/RackHD/pull/889
      1. RAC-6226

How to add standalone services to the Master CI/CD pipeline (e.g., SMI Micro Services, UCS). Right now Master CI is strictly core RackHD.

  1. Status of on-network/on-topology and test/deployment options
    1. New services should follow 12factor.net guidelines; a REST API should be available for IPC between services
  2. SMI Service Integration to CI
    1. Former user (Deleted) has downloaded the idrac simulation tool, currently under evaluation
      1. tool supports only read operations
      2. RackHD Epic to be created that introduces workflow testing to RackHD CI/CD. This will cover smi service testing; it does not cover "plugin" integration testing
    2. The idrac simulation tool will be used for virtualized testing (PR quality gates and post merge testing), and the 13g Dell physical hardware will be introduced to the Regression-Baremetal job for smi workflow regression testing.

    3. Michael Hepfer and Former user (Deleted) to sync up offline to stand up a concourse environment 


Process change for Master CI failures - how long can a developer work on a fix for a Master CI failure before being required to back out the change and get back to green?

  1. The developer has 1 working day to resolve the issue (i.e., up to the next MasterCI run at 6:31pm EST)
  2. If not resolved, the code is backed out and the MasterCI job is re-run to ensure that the pipeline is returned to green.
  3. If thought to be resolved, MasterCI is re-run to ensure the pipeline returns to green
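
The rule above can be written down as a small decision helper. This is only a sketch of one reading of the notes, assuming MasterCI runs at 18:31 on weekdays only (the notes say "working day" but do not state the weekend schedule) and that all timestamps are naive datetimes already expressed in the CI's local time. The function names are hypothetical.

```python
from datetime import datetime, time, timedelta

# Run time taken from the notes (6:31pm); treated as CI-local time.
RUN_TIME = time(18, 31)


def next_master_ci_run(failed_at: datetime) -> datetime:
    """Return the first weekday 18:31 strictly after failed_at.

    Assumption: MasterCI does not run on Saturday or Sunday.
    """
    candidate = datetime.combine(failed_at.date(), RUN_TIME)
    if candidate <= failed_at:
        candidate += timedelta(days=1)
    while candidate.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        candidate += timedelta(days=1)
    return candidate


def required_action(failed_at: datetime, now: datetime, fixed: bool) -> str:
    """Map the three steps in the notes onto a single decision."""
    if fixed:
        return "re-run MasterCI to confirm green"
    if now >= next_master_ci_run(failed_at):
        return "back out the change and re-run MasterCI"
    return "keep working on the fix"


# A failure seen at the Thursday 2017-09-28 run must be fixed by Friday's run.
print(next_master_ci_run(datetime(2017, 9, 28, 18, 31)))
```

Under these assumptions a failure at Friday's run gives the developer until Monday's run, which is one way to interpret "1 working day"; the team may intend something different.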


Racadm→WSMAN tooling conversion

  1. Agreement at OLT that we will be going fully wsman-based and eliminating racadm from workflow support.


Next meeting will be Thursday October 5.