Core Committer Weekly Interlock - October 12th 2017

Attendees

Former user (Deleted)

Former user (Deleted)

Former user (Deleted)

Former user (Deleted)

Bishoy Youssef

Leo Zhang

Former user (Deleted)

Michael Hepfer

Former user (Deleted)

Amy Mullins

JP

Agenda

  1. RackHD Documentation A
    1. Per the Slack channel, 3lm0 [11:42 AM]:

    "@maithri indeed, the blockers for now are the UI/UX for our end users which aren't technical one, and the consistency among the documentation leading to a difficult installation process with various pitfalls to be aware of plus the fact that as operator you have to deal with too many configuration and manual operations."

    2. AI: Michael Hepfer to send Jira ticket info.
    3. Former user (Deleted): Can the documentation story be completed by the Mustang team?
    4. The node packaging story is to be discussed outside of this meeting, by email.

  2. On-web-ui long term plan

    1. Maintain as-is until Kastalist UI is required.  
      1. Critical bug fix and API alignment only
        1. Former user (Deleted) to create PRs for the known issues worked on with CenturyLink.
        2. Former user (Deleted) to provide list of outstanding Jira bugs to determine if they are "critical" or can be set to p3.
      2. Continue to deploy as part of RackHD, but do not build up CI test capabilities. 
  3. Architectural discussion: single entry point for services - AI: who is going to do the work? Where does it fit with regard to priorities? As of 10/12, no update.
    1. Looking for one entry point to all of the microservices; briefly discussed how smi services leverage Zuul today. Should this also be considered for on-taskgraph and other future RackHD services?
    2. Currently, smi services also use the workflow engine as an entry point to those services. Looking to eliminate the smi/workflow engine dependency and leverage a standard API gateway.
      1. Zuul was leveraged for smi services because it met the smi service requirements.
      2. Recommend looking at other technologies (including nginx).
      3. Requirements: generate custom filters, load the config from a key-value store.
      4. Generate a list of features for these technologies, including a POC (how it would work with smi services and node.js).
      5. AI: Amy Mullins to create epic and initial spike story(ies). Assignment can be coordinated through the managers. Recommending to go with a dev who knows the APIs very well (Redfish, 2.0).
        1. Expectation is a report-out on the various technologies, including a POC.
        2. Priority on RackHD backlog? AI: Thomas Sullivan to help determine where RackHD epics/RACs are now prioritized.
  4. RackHD Release Cadence - AI: how many nightly tags should we keep? Leo Zhang: with this nightly tag, do we handle the concerns around traceability? AI: Former user (Deleted), how are we coming with the bare metal regression testing on the Dell stack on Concourse?
    1. 10/27 is still the target date to bring the Concourse-based PR quality gates and Post Merge online.
      1. Dockerhub updates:
        1. devel will reflect the latest merged PR (always overwritten)
        2. latest will reflect the latest Sprint Release
        3. nightly will be deleted, covered by devel
        4. preserve the releases (2.5.0, 2.10.0, ....)
        5. remove everything else. 
          1. There was a concern about traceability: we can recreate the image via the digest of the built Docker image, which references the git commit hash. If, in the rare case, an external contributor is looking for an engineering build, they can request it or rebuild it themselves. If we find there is an excessive number of requests for rebuilding engineering drops, we can explore how best to preserve them. The effort to support this in our CI pipeline is believed to be greater than the effort to recreate the image on the rare occasion.
          2. All the builds are also saved locally.   
      2. Debian updates: 
        1. N/A; plan to re-use what is already there.
    2. Sprint Release cadence will still be maintained on a weekly basis and will change from running virtually to running on physical HW (regression-baremetal job).
      1. Discussed in QRB meeting on 10/11/17:  Regression-baremetal job is not yet stable.  Ongoing efforts this week to stabilize that job.  Will review the test results and attempt to set a target date for when that can go online. 
    3. Added post-meeting: Regression-baremetal job will be set up to run Sun-Thurs nights to mitigate risk and minimize surprises for the Friday release.
      1. Epic created to track the Sprint Release/Regression test efforts: RAC-6326
    4. If we are releasing debians and docker containers AND the demo is moved to a docker image, do we still need to provide a script in the new CI env that allows users to generate a Vagrant-based RackHD image?
      1. 9/28 CC discussion concluded that the Vagrant demo env and dev env become obsolete with the docker compose based demo - AI: Former user (Deleted), can you remove the building of Vagrant from the demo? Update 10/12: Will be done by the Veyron team as part of bringing the docker-based demo online.
  5. Review slides from Former user (Deleted) on CI security: moving CI to containers, moving CI to cloud.
    1. AI: CC team to review the slides, then compare and discuss with the ongoing dialog leveraging Virtustream between Felouka and Veyron.
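
Illustration for the Dockerhub updates under item 4: the agreed tag policy (keep devel, latest, and the release tags; delete nightly and everything else) could be sketched roughly as below. This is a minimal sketch only. The function names are hypothetical, and it assumes release tags are exactly three dot-separated numbers such as 2.5.0, which is not confirmed by the notes. It only classifies tag names and does not call any registry API.

```python
import re

# Assumption: release tags look like "2.5.0" or "2.10.0" (three numeric parts).
RELEASE_TAG = re.compile(r"\d+\.\d+\.\d+")

# Floating tags kept by the policy: devel (latest merged PR, always
# overwritten) and latest (latest Sprint Release).
FLOATING_TAGS = {"devel", "latest"}


def should_keep(tag):
    """Return True if the tag survives the cleanup policy."""
    return tag in FLOATING_TAGS or RELEASE_TAG.fullmatch(tag) is not None


def plan_cleanup(tags):
    """Split tags into (keep, remove) lists, preserving input order."""
    keep = [t for t in tags if should_keep(t)]
    remove = [t for t in tags if not should_keep(t)]
    return keep, remove


if __name__ == "__main__":
    # Hypothetical tag list for illustration only.
    keep, remove = plan_cleanup(
        ["devel", "latest", "nightly", "2.5.0", "2.10.0", "nightly-20171011"]
    )
    print("keep:", keep)
    print("remove:", remove)
```

Anything in the remove list could still be rebuilt from the git commit hash if an engineering build is requested, per the traceability discussion.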