Core Committer Weekly Interlock - October 5th 2017

Attendees

Former user (Deleted)

Michael Hepfer

Bishoy Youssef

Former user (Deleted)



Agenda

  1. Architectural discussion: single entry point for services - AI: who is going to do the work? Where does it fit with regard to priorities?
    1. Looking for one entry point to all of the microservices; discussed briefly how the SMI services leverage Zuul today. Should this also be considered for on-taskgraph and other future RackHD services?
    2. Currently the SMI services also use the workflow engine as an entry point. Looking to eliminate the SMI/workflow engine dependency and leverage a standard API gateway.
      1. Zuul is leveraged for the SMI services because it met the SMI service requirements
      2. Recommend looking at other technologies (including nginx)
      3. Requirements: generate custom filters, load the config from a key-value store
      4. Generate a list of features for these technologies, including a POC (how it would work with the SMI services and Node.js)
      5. AI: Amy Mullins to create epic and initial spike story(ies). Assignment can be coordinated through the managers. Recommend going with a dev who knows the APIs very well (Redfish, 2.0)
        1. Expectation is a report-out on the various technologies, including a POC.
        2. Priority on RackHD backlog? AI: Thomas Sullivan to help determine where RackHD epics/RACs are now prioritized.
  2. RackHD Release Cadence - AI: how many nightly tags should we keep? Leo Zhang: with this nightly tag, do we handle the concerns around traceability? AI: Former user (Deleted): how are we coming along with the bare metal regression testing on the Dell stack on Concourse?
    1. As we’re moving to continuous delivery for the Concourse-based CI (i.e., deployed packages per merged PR), does that change the need for, or frequency of, weekly RackHD sprint releases? Email thread started on 9/20.
      1. QRB discussion on 9/27 included having PR quality gates/post-merge testing on virtual hardware and updating the weekly Sprint Release to run on physical hardware (Baremetal-Regression)
      2. Release process email thread status:
        1. We would be able to maintain traceability with what we have today as part of the nightly build. The difference is that instead of a time-based forced build around 9:30 PM EST, each merged PR would trigger a build of only the modified repos. Tag names will change from “nightly” to “devel”. Our nightly builds are overwritten today; the devel builds would likewise be overwritten, but upon PR merge and only for the modified repos, not for repos that haven’t changed.

           Continuous deployment gives us the real-time testing (FIT and deployment) that we previously had with MasterCI. Now you don’t need to wait until 9:30 PM for failures.

           The concept of a weekly Sprint release will be preserved; however, it will be changed to run the baremetal regression job (i.e., testing on physical HW). Since that job is in the process of being converted to Concourse, some of the automation of the release process will also need to be updated. There are also ongoing discussions about versioning and about efficiencies/improvements in how the packages are deployed (they are currently being republished even if there is no change).

    2. If we are releasing Debian packages and Docker containers AND the demo is moved to a Docker image, do we still need to provide a script in the new CI env that allows users to generate a Vagrant-based RackHD image?
      1. 9/28 CC discussion concluded that the Vagrant demo env and dev env become obsolete with the Docker Compose based demo - AI: Former user (Deleted): can you remove the building of Vagrant from the demo?
  3. Review slides from Former user (Deleted) on CI security, moving CI to containers, and moving CI to the cloud.
    1. AI: CC team to review the slides, come back with feedback/answers to the questions posed in the slide deck.
  4. RackHD Documentation
    1. Per the Slack channel:  3lm0 [11:42 AM] 

    @maithri indeed, the blockers for now are the UI/UX for our end users which aren't technical one, and the consistency among the documentation leading to a difficult installation process with various pitfalls to be aware of plus the fact that as operator you have to deal with too many configuration and manual operations.

    AI: Michael Hepfer to work on the documentation issues raised on Slack.

  5. On-web-ui long term plan

    1. AI: Amy Mullins to send email on the topic to get the discussion started


Moved to email discussion:

RackHD Tooling Updates

  1. The RackHD Tooling Update work is tracked by 2 epics: RAC-6146 & RAC-6132


    1. Ubuntu 16.04:
      • Jenkins Function Test has been switched to 16.04 (RAC-6137)
      • Source code and deb package are both supported now (RAC-6135, RAC-6136)
      • Released docker upgrade to 16.04 (RAC-6133) - lower priority
      • Released OVA/vagrant upgrade to 16.04 (RAC-6134) - lower priority
      • Upgrade Jenkins slave from 14.04 to 16.04 (RAC-6138) - lower priority
    2. RabbitMQ (RAC-6207, RAC-6208): To Do
    3. MongoDB Update (RAC-6144, RAC-6145, RAC-6147): ongoing. So far so good.
    4. Node.js 6/8 upgrade
      1. Unit tests pass (RAC-6232, RAC-6243); enable Travis CI Node 6/8 as PR gate (RAC-6235)
      2. RackHD function test pass (RAC-6234, RAC-6244)
      3. Jenkins pipeline regression (RAC-6236, RAC-6237, RAC-6238)

Recently closed out: