Core Committer Weekly Interlock - September 14th 2017

Attendees

Former user (Deleted)

Former user (Deleted)

Former user (Deleted)

Ahmed Osama

Bishoy Youssef

Former user (Deleted)

Former user (Deleted)

Thomas Sullivan

Former user (Deleted)

Michael Hepfer

Tim Larson

Former user (Deleted)

Erik Smith

Leo Zhang

Former user (Deleted)


Agenda

  1. Ease of use of the Vagrant based demo
    1. Week of 8/7 comment on the slack channel: "At this point I've given up on rackhd. If even the demo requires an old version of ubuntu to run an old version of virtualbox to get it working, I will stick with something simpler."
      1. Requirements agreed to by the team:
        1. The demo should be simple, convenient, and easy to use, bring up, and debug.
        2. The environment should work across different versions, including the latest, of dependencies such as MongoDB, Docker, and VirtualBox.
        3. Host OS independent: can run on a Windows, Linux, or macOS host system.
        4. Has no impact on, and no dependency on, the host network.
        5. Utilizes existing nightly RackHD images; does not require building or testing additional new images.
        6. Uses InfraSIM for vBMC nodes.
        7. Can run discovery workflows.
        8. Can run OS install workflows.
        9. Can run the FIT smoke test suite.
        10. The demo solution could technically support running in the cloud (e.g., IaaS, PaaS).
        11. Includes the SMI microservice containers.
      2. Review POC from Former user (Deleted)
        1. Ran 2 scripts (1 to get rackhd services up, 1 to get infrasim started)
        2. RackHD code is up to date as of the prior sprint release; the InfraSIM version is locked down.
        3. Look into ways of reusing the existing config.json file.
        4. https://github.com/RackHD/RackHD/pull/889
      3. Review POC from Former user (Deleted)
        1. https://github.com/RackHD/RackHD/pull/857
        2. RackHD and InfraSIM run from source inside a single Docker container.
      4. AI for the team: review both POCs; at the next meeting we will identify which one to move forward with.
  2. RackHD Tooling Updates
    1. Ubuntu to be upgraded to 16.04; 18.04 to be released ~April 2018.
      1. RackHD epic / Concourse KI to include migration of the CI environment to Ubuntu 16.04. What has been developed to date for the Concourse env includes the 16.04 migration.
        1. Former user (Deleted) to provide details on what is available.
      2. OVA scripts will need to be updated (passing a parameter) to move to 16.04.
    2. Node v6 is the currently available version; RackHD is running v4.
      1. v4 will be EOL 4/18
      2. v8 releases next month 
      3. RackHD Epic to be created to migrate from v4 → v8
        1. Needs to be assigned.  Former user (Deleted)/Maglev team to help create the epic.
    3. RackHD Epic to be created so that the CI env tests the latest MongoDB version (MongoDB recommends using 3.x+ versions only and does not support anything in the 2.x version family).
      1. Do we want to support this in Jenkins, Concourse, or both? Will be part of the Concourse env.
      2. On the Jenkins side, this seems to be a trivial effort to support. It may surface issues in RackHD and show whether previous issues have since been resolved.
        1. Plan to try testing with the latest version and see what the issues are. If it is a trivial set of issues, move to the latest. If many issues are encountered, hold off.
        2. Maglev team to create the story and target next sprint.
  3. Process change for Master CI failures - how long can a developer work on a fix for a Master CI failure before being required to back out the change and get back to green?
    1. The developer has 1 working day to resolve the issue (i.e., up to the next MasterCI run at 6:31pm EST).
    2. If not resolved, the code is backed out and the MasterCI job is re-run to ensure that the pipeline returns to green.
    3. If thought to be resolved, MasterCI is re-run to ensure that the pipeline returns to green.
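
The Master CI back-out rule above can be sketched as a small decision helper. This is only an illustration of the agreed process, not an existing RackHD or CI tool: the function names and the action strings are hypothetical, the 6:31pm run time is taken from the notes, and time zones, weekends, and holidays are deliberately ignored.

```python
from datetime import datetime, time, timedelta

# Daily MasterCI run time from the notes (6:31pm); time zone handling omitted.
MASTER_CI_RUN = time(18, 31)


def next_master_ci_run(failed_at: datetime) -> datetime:
    """Return the first scheduled MasterCI run strictly after the failure."""
    candidate = datetime.combine(failed_at.date(), MASTER_CI_RUN)
    if candidate <= failed_at:
        candidate += timedelta(days=1)
    return candidate


def master_ci_action(failed_at: datetime, now: datetime, fix_ready: bool) -> str:
    """Hypothetical helper: decide what to do about a red Master CI pipeline."""
    if fix_ready:
        # Issue thought to be resolved: re-run MasterCI to confirm green.
        return "rerun_master_ci"
    if now >= next_master_ci_run(failed_at):
        # Deadline reached without a fix: back out, then re-run MasterCI.
        return "back_out_and_rerun_master_ci"
    # Still inside the one-working-day window.
    return "keep_working"
```

For example, a failure seen after the Thursday evening run gives the developer until the Friday 6:31pm run; a real implementation would also need to decide how weekends count toward "1 working day".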

Did not get to the agenda items below:

  1. RackHD Release Cadence

    1. As we're moving to continuous delivery for the Concourse-based CI (i.e., deployed packages per merged PR), does that change the need for, or frequency of, weekly RackHD sprint releases?

  2. Review slides from Former user (Deleted) on CI security, moving CI to containers, and moving CI to the cloud.

    1. AI: CC team to review the slides, come back with feedback/answers to the questions posed in the slide deck.
  3. From QRB meeting notes
    1. Agreement at OLT that we will be going fully wsman-based and eliminating racadm from workflow support.
      1. Content finalized. Leo Zhang / Maglev team will work with Thomas Sullivan on generating the official KI.
      2. Test plan: the iDRAC simulation tool will be used for virtualized testing (PR quality gates/MasterCI), and more Dell physical hardware will be introduced to the Regression-Baremetal job.

        1. Former user (Deleted) has downloaded the iDRAC simulation tool, which is currently under evaluation.
          1. The tool supports only read operations.
          2. RackHD Epic to be created that introduces workflow testing to RackHD CI/CD. This will cover SMI service testing; it does not cover "plugin" integration testing.
  4. BareMetal Regression Pipeline now created/monitored. 
    1. Plan is to monitor for a few weeks. Should it then be a gate?
    2. BareMetal OS install on real hardware currently runs every 2 hours on the nightly docker images.  Will need to kick off BareMetal at same time as CI
    3. Do we then continue to run BareMetal every 2 hours?
    4. Should this be part of the Master CI pipeline? If so, we would need a modification to the Merge Freeze tool to freeze on failure of the BareMetal regression tests.
  5. All SMI Services have been published.
    1. What documentation is needed, what kind of communication is needed for the open source community?
  6. How to add standalone services to the Master CI/CD pipeline (e.g., SMI microservices, UCS)? Right now Master CI is strictly core RackHD.



Next meeting will be Thursday September 21.