azure / review-checklists

This repo contains code and examples to operationalize Azure review checklists.

License: MIT License

VBA 14.78% Shell 9.28% Dockerfile 0.30% Python 51.37% HTML 3.55% PowerShell 1.19% Bicep 3.30% Visual Basic .NET 16.21%

review-checklists's Introduction

Azure Review Checklists

Quick links for using the checklists in this repo:

Latest release of the Excel spreadsheet
https://aka.ms/ftaaas for a web frontend (check out our sister repo https://github.com/Azure/fta-aas).

Summary of checklists supported and the respective responsible owners:

Checklist	Status	CodeOwners
ALZ	GA	FTA-ALZ-vTeam, ALZ-checklist-contributors
WAF	GA	Dynamically generated
AKS	GA	@msftnadavbh @seenu433 @erjosito
ARO	Preview	@msftnadavbh @naioja @erjosito
AVD	GA	@igorpag @mikewarr @bagwyth
Cost	GA	@brmoreir @pea-ms
Multitenancy	GA	@arsenvlad @cherchyk
Application Delivery Networking	GA	@erjosito @andredewes
AVS design	Preview	@fskelly @mgodfrey50 @robinher
AVS implementation	Preview	@fskelly @mgodfrey50 @robinher
SAP	GA	@NaokiIgarashi @AlastairMorrison @mottach
API Management	Preview	@andredewes @seenu433
Stack HCI	Preview	@mbrat2005 @steveswalwell @igomaa
Spring Apps	Preview	@bappadityams @vermegi @fmustaf
Azure DevOps	Preview	@roshair
SQL Migration	Preview	@karthikyella @dbabulldog-repo
Security	Deprecated	@mgodfrey50 @rudneir2

What is an Azure Design Review?

A common request from organisations starting out with the public cloud is to have their design double-checked to make sure that best practices are being followed. The scope of this exercise can vary from generic Azure landing zones to workload-specific deployments.

When doing Azure design reviews (or any review for that matter), Microsoft employees and Microsoft partners often leverage Excel spreadsheets as the medium of choice to document findings and track design improvements and recommendations. A problem with Excel spreadsheets is that they are not easily subject to revision control. Additionally, team collaboration with branching, issues, pull requests, reviews, and others is difficult at best—impossible in most cases.

Why this repository?

This repo separates the actual review checklist content from the presentation layer, so that the JSON-formatted checklist can be kept under version control and then imported into an Excel spreadsheet by means of Visual Basic for Applications (VBA) macros for easier handling (not all of us like working natively with JSON). The provided Checklist Review Spreadsheet interprets JSON using the VBA module from https://github.com/VBA-tools/VBA-JSON/, a copy of which is included in this repo to keep it self-contained (make sure you use the latest version though). The Checklist Review Spreadsheet includes some macros (find the source code both in the spreadsheet and here), which are accessible from control buttons in the main sheet.
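
As a rough illustration of what separating content from presentation means in practice, the sketch below reads a checklist from JSON and flattens it into rows such as a spreadsheet would display. The top-level keys (items, categories, severities, status) match those that appear elsewhere on this page; the per-item fields (guid, category, severity, text) and all sample values are assumptions made for the example, not a definition of the real schema.

```python
import json

# Hypothetical, trimmed-down checklist. The per-item fields are illustrative
# assumptions; real checklist files may carry more (or different) attributes.
SAMPLE_CHECKLIST = """
{
  "items": [
    {"guid": "00000000-0000-0000-0000-000000000001", "category": "Networking",
     "severity": "High", "text": "Example recommendation A"},
    {"guid": "00000000-0000-0000-0000-000000000002", "category": "Security",
     "severity": "Medium", "text": "Example recommendation B"}
  ],
  "categories": [{"name": "Networking"}, {"name": "Security"}],
  "severities": [{"name": "High"}, {"name": "Medium"}, {"name": "Low"}],
  "status": [{"name": "Not verified"}, {"name": "Open"}, {"name": "Fulfilled"}]
}
"""


def checklist_rows(raw_json):
    """Flatten checklist items into (category, severity, text) tuples, one per row."""
    checklist = json.loads(raw_json)
    return [
        (item.get("category", ""), item.get("severity", ""), item.get("text", ""))
        for item in checklist.get("items", [])
    ]


rows = checklist_rows(SAMPLE_CHECKLIST)
for row in rows:
    # Tab-separated output pastes cleanly into spreadsheet columns.
    print("\t".join(row))
```

Because the content lives in plain JSON, any presentation layer (the VBA macros, a web frontend, or a script like this one) can consume the same file.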

Note: the VBA code in the spreadsheet does not work on Excel for Mac, due to some critical missing libraries.

Additionally, after every commit a GitHub action in this repository translates the English version of the checklist into additional languages (Japanese, Korean, Spanish, and Brazilian Portuguese) using the Azure Translator cognitive service. See an example of a translated checklist in aks_checklist.ja.json.

Reporting errors and contributing

Please feel free to open an issue or create a PR if you find any error or missing information in the checklists, following the Contributing guidelines

Using the spreadsheet for Azure reviews

Download the Excel spreadsheet from the latest release to your PC
Use the dropdown lists to select the technology and the language in which you would like to do your review

Click the control button "Import latest checklist". After you accept the verification message, the spreadsheet will load the latest version of the selected technology and language
(Optional) If you are going to distribute the spreadsheet to users who cannot work with macros (for example, either because of security reasons or because they use Office for Mac), save a version of the spreadsheet in xlsx format (instead of xlsm). Note that disabling macros will result in the spreadsheet losing its ability to import updated versions of the checklist or JSON-based Azure Resource Graph query results
Go row by row, set the "Status" field to one of the available options, and write any remarks in the "Comments" field (such as why a recommendation is not relevant, or who will fix the open item)
1. Since there are many rows in a review, it is recommended to proceed in chunks: either going area after area (first "Networking", then "Security", etc) or starting with the "High" priority elements and afterward moving down to "Medium" and "Low"
2. If any recommendation is not clear, there is a "More Info" link with more context information.
3. IMPORTANT: design decisions are not a checkbox exercise, but a series of compromises. It is OK to deviate from certain recommendations if the implications are clear (for example, sacrificing security for operational simplicity or lower cost for non-critical applications)
Check the "Dashboard" worksheet for a graphical representation of the review progress

Security settings for running macros

There are some settings that you might need to change in your system to run macro-enabled Excel spreadsheets. When initially opening the file you may see the following error, which prevents Excel from loading:

Excel cannot open the file 'review_checklist.xlsm' because the file format or file extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file.

In other cases, the file opens with the following message, which prevents you from being able to load the checklist items:

Unblock the file or add an exception to Windows Security

You might need to unblock the file from the file properties in the Windows File Explorer so that you can use the macros required to import the checklist content from github.com:

Additionally, you might want to add the macro-enabled spreadsheet file to the list of exceptions in Windows Security (in the Virus & Threat Protection section):

Using the spreadsheet to generate JSON checklist files (advanced)

If you wish to contribute to the checklists, one option is the following:

Load up the latest version of the checklist you want to modify
Do the required modifications to the checklist items
Push the button "Export checklist to JSON" in the "Advanced" section of controls in the checklist. Store your file in your local file system, and upload it to the checklists folder of this GitHub repo (use the format <technology>_checklist.en.json, for example, lz_checklist.en.json)
This will create a PR, which will be reviewed by the corresponding approvers.

Using Azure Resource Graph to verify Azure environments (advanced)

Some of the checks have associated Azure Resource Graph queries, which return a list of related resources and a compliance status for each. Resource Graph queries enable objective verification of the associated checks and make filling out the spreadsheet easier by collecting some environment details for you.

Along with the spreadsheet, this repo includes the script checklist_graph.sh. This script will run the graph queries stored in the JSON checklists and produce an output that can easily be copied and pasted into the spreadsheet, or alternatively generate a JSON file that can then be imported to the spreadsheet.

See the checklist_graph.sh README file for more information about how to use checklist_graph.sh.
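
To make the idea of queries stored in the JSON checklists more concrete, here is a small sketch that collects the query text attached to each check. It is not how checklist_graph.sh is implemented: the field name graph, the sample GUIDs and the sample query are all assumptions for illustration, and the sketch stops short of actually calling Azure (which could be done, for example, with the Azure CLI command az graph query).

```python
# Hypothetical checklist fragment: one item with a query, one without.
sample_checklist = {
    "items": [
        {
            "guid": "00000000-0000-0000-0000-000000000001",
            "text": "Example check with an associated query",
            # Assumed field name; the query itself is only illustrative.
            "graph": "resources | where type == 'microsoft.network/virtualnetworks' "
                     "| project id, compliant = 1",
        },
        {
            "guid": "00000000-0000-0000-0000-000000000002",
            "text": "Example check that has to be verified manually",
        },
    ]
}


def graph_queries(checklist):
    """Return (guid, query) pairs for every item that carries a non-empty query."""
    return [
        (item["guid"], item["graph"])
        for item in checklist.get("items", [])
        if item.get("graph")
    ]


queries = graph_queries(sample_checklist)
for guid, query in queries:
    print(f"{guid}: {query}")
```

Keying each query by the check's GUID is what allows the results to be matched back to the right spreadsheet row.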

Disclaimer

This is not official Microsoft documentation or software.
This is not an endorsement or a sign-off of an architecture or a design.
This code sample is provided "AS IS" without warranty of any kind, either expressed or implied, including but not limited to the implied warranties of merchantability and/or fitness for a particular purpose.
This sample is not supported under any Microsoft standard support program or service.
Microsoft further disclaims all implied warranties, including, without limitation, any implied warranties of merchantability or fitness for a particular purpose.
The entire risk arising out of the use or performance of the sample and documentation remains with you.
In no event shall Microsoft, its authors, or anyone else involved in the creation, production, or delivery of the script be liable for any damages whatsoever (including, without limitation, damages for loss of business profits, business interruption, loss of business information, or other pecuniary loss) arising out of the use of or inability to use the sample or documentation, even if Microsoft has been advised of the possibility of such damages

review-checklists's People

Forkers

huberthk jasonbeck2 jiyongseong gordonby danielmeixner lukemurraynz jasoncard beatrizsilv yosanbe sasukeh vermegi jonathan-vella kbk574 mbrat2005 joanabmartins infosatheesh2020 nielsophey huangyingting rebuildtech konugautham 0-1-2-3-4 stackjackson aniltexascowboy upendra25312 x0techdad azure4arjun florian-ried jamelachahbar-zz msftnadavbh yachoukh amitdash abekifle rafnijs onlykumarabhishek ravibak ehloworldio dp-miller robert-bakker ercex n8london subhashc brmoreir davidarayasanabria chunliu hjgraca sdolgin varma31 jelledruyts acmo21 gethuravi2020 fidelcasto fskelly cheahengsoon shivanandsingh hatemalsum anders-kristiansen 16rookie16 elineherbert amit-kumar79 videshmukh hiroyay-ms talibj ltkvien naioja kevhwang dmelgarejopdesgo srini-thumala azmiruddin kaspanitz rackerjon kemiwilliam ievsantillan isantillan1 tgaadmin arvindmits koolacac rakhilravindran007 linors sangling shabbirvanelly mysterysommy elsamoheg vanthanvu 0nopnop arunkrrai abdennebi-forks rafiinamdar-ops goldeneagle16 cloud-studio-repository-org sunchippss claudioferejes shivamsingh1441 johndowns tellstrom paullisto-insight nate8523 mofaizal mklifman roopeshnair sajipoochira

review-checklists's Issues

Create formal releases where we can have change history and notifications

Using GitHub Releases can be a great way of "capturing" a point in time version that can be distributed formally. Also enables notifications for those tracking the project.

Clarification on the rationale behind Governance recommendation.

Under Security Governance and Compliance:
'Use Azure Policy to control resource provider registrations at the subscription and/or management group levels'

In a recent call with a customer the feedback we got about this guideline was that the rationale behind it wasn't very clear and that it was too specific. Plus, it seems like there is no actual built-in policy to achieve this, so customers would need to create it from scratch.

Improve prioritization of LZ checklist

Existing LZ checklist has most of the items classified as "Medium". We should probably objectively decide what the three different levels mean, and reclassify the checks.

For example, as description of the levels something like:

High: mandatory recommendation, regardless of customer requirements
Medium: this recommendation can have a large impact in the design, but it is not necessarily mandatory
Low: nice to have

After the meaning of the priorities is decided, we should reprioritize the different checks of the LZ checklist.
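
Before reprioritizing, it may help to measure how skewed the current classification is. The following sketch counts items per severity level; the item list is made up for the example and the field name severity is an assumption.

```python
from collections import Counter

# Hypothetical items; only the "severity" field matters for this sketch.
items = [
    {"text": "Example check 1", "severity": "Medium"},
    {"text": "Example check 2", "severity": "Medium"},
    {"text": "Example check 3", "severity": "High"},
    {"text": "Example check 4", "severity": "Low"},
]


def severity_distribution(checks):
    """Count how many checklist items fall into each severity level."""
    return Counter(check.get("severity", "Unclassified") for check in checks)


distribution = severity_distribution(items)
for level in ("High", "Medium", "Low"):
    share = 100 * distribution[level] / len(items)
    print(f"{level}: {distribution[level]} items ({share:.0f}%)")
```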

Missing "Comment" attribute in the AVD JSON checklist

In the original format of AVD XLSX checklist, we populated a column "Comment" with useful information.
There is value in preserving that content, so I would recommend taking the content from the original XLSX file and adding it as an additional attribute to each JSON fragment.

Translate GitHub action too limited

The existing GitHub action has some limitations:

It doesn't support the category parameter for Azure Translator, hence it will not work with custom translators
It doesn't support specifying a list of JSON keys that should not be translated (such as the Azure Resource Graph queries)
It changes the JSON structure, and transforms arrays into dictionaries, which makes processing of English and non-English checklists inconsistent

Issues have been opened in the original GitHub action, but it doesn't look like they will be resolved any time soon. Hence, a workaround might be to create our own GitHub action that runs the translation and supports the missing features described above.
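
A minimal sketch of what such a replacement could do for two of the limitations above: walk the JSON, leave a configurable set of keys untranslated, and keep arrays as arrays. The skipped key names (guid, graph, link) are assumptions, and fake_translate is a stand-in; a real action would call the translation service at that point.

```python
def translate_checklist(node, translate, skip_keys=frozenset({"guid", "graph", "link"})):
    """Recursively translate string values, preserving structure and skipped keys."""
    if isinstance(node, dict):
        return {
            key: value if key in skip_keys
            else translate_checklist(value, translate, skip_keys)
            for key, value in node.items()
        }
    if isinstance(node, list):
        # Lists stay lists, so translated files keep the same shape as the English one.
        return [translate_checklist(value, translate, skip_keys) for value in node]
    if isinstance(node, str):
        return translate(node)
    return node  # numbers, booleans and null pass through untouched


def fake_translate(text):
    """Placeholder for a real translation call (hypothetical)."""
    return f"[es] {text}"


sample = {
    "items": [
        {
            "guid": "00000000-0000-0000-0000-000000000001",
            "text": "Example recommendation",
            "graph": "resources | project id, compliant = 1",
        }
    ]
}

translated = translate_checklist(sample, fake_translate)
print(translated)
```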

Consider generating the "latest and greatest" Excel files from the json from a GitHub actions pipeline

VBA macros or Office add-ins are a (slight) burden on users of the Excel spreadsheet. Perhaps generating the latest version of the Excel file (in multiple languages for different technologies) could simplify "consumption" of the Excel sheet?

Scenarios where Macros are disabled.

Perhaps we could add a note that macros are required and that, if customers don't allow them or have them disabled, they could wait for the web-based version, which is in progress?

Application layout / architecture to be created for Container pieces

Creation of a mini Architecture diagram

Add more Azure Resource Graph queries to LZ checklist

Depending on #181 , since we probably want to start with high-prio items. Today we are in a pretty good shape in the AKS checklist regarding ARG queries, but the LZ checklist is still pretty short. The Contributing guide has some information on how to add queries: https://github.com/Azure/review-checklists/blob/main/CONTRIBUTING.md#adding-resource-graph-queries

Graph report not scalable

The existing script checklist_graph.sh produces output that can be inserted into the "Comments" field of the review spreadsheet, but it is not scalable:

The compliant/non-compliant resources are marked with the short syntax /, for readability purposes. This is fine for small environments, but normally you would want the full ARM ID of the resources
The script doesn't support running the queries over multiple subscriptions, or scoped to a Management Group

Proposed solution:

Modify queries so that they additionally give back an ARM ID
Change the checklist_graph.sh script to give output in a "compact" format (current) or "extended" (including the full resource ID)
Include the possibility of scoping the queries to multiple subscriptions or to Management groups
The "extended output" might take the form of a CSV/JSON file with these columns: ResourceId, Pass/Fail, checkGUID, checkText (optional). This way it could be read independently of the spreadsheet
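
The proposed "extended" format could look roughly like the sketch below, which writes one CSV row per resource using the columns listed above. The shape of the input records (id, compliant, guid, text) and the placeholder resource IDs are assumptions for illustration only.

```python
import csv
import io

COLUMNS = ["ResourceId", "Pass/Fail", "checkGUID", "checkText"]

# Hypothetical query results, already tagged with the check they belong to.
results = [
    {"id": "/subscriptions/<sub-id>/resourceGroups/rg1/providers/Microsoft.Network/virtualNetworks/vnet1",
     "compliant": 1, "guid": "00000000-0000-0000-0000-000000000001", "text": "Example check"},
    {"id": "/subscriptions/<sub-id>/resourceGroups/rg2/providers/Microsoft.Network/virtualNetworks/vnet2",
     "compliant": 0, "guid": "00000000-0000-0000-0000-000000000001", "text": "Example check"},
]


def extended_report(records):
    """Render query results as CSV text with the full resource ID in each row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow({
            "ResourceId": record["id"],
            "Pass/Fail": "Pass" if record["compliant"] else "Fail",
            "checkGUID": record["guid"],
            "checkText": record.get("text", ""),  # optional column
        })
    return buffer.getvalue()


report = extended_report(results)
print(report)
```

A file in this shape can be filtered or aggregated with any CSV tool, independently of the spreadsheet.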

checklist_graph.sh

Hi Jose,

I can see the intent & massive value of being able to check for these values/settings programmatically and would love to help.
I can get this to work correctly in the Az CLI as stated in the README; however, I'm not sure what it's doing.

Where can I find more info or an example of how to get this working in checking "specific" components?
Or am I not understanding how the GUID's work?

Eager to help make this work

DaveC

Customer feedback regarding wording in the checklist

Modify the wording of a few items within the checklist based on customer feedback.
Questions or statements should be clearer.

Assessment platform and WAF alignment

It is highly likely that these efforts around review checklists would be better invested within the assessment platform and WAF service checklists, rather than an independent github repository since it ultimately represents a duplication of effort, content, and presents a disjointed customer experience overall. The content organisation (Dom Allen) will be able to partner with you to better integrate this content, and align things more closely with collective efforts in this space.

Duplicate content in `samples` folder

sample folder to be removed when links from ftawiki are updated

Capture more detailed usage metrics

To better understand how often the review checklists are being used, we should track the download of JSON files in detail. These metrics can be used to detect trends in usage, which is valuable feedback.

We'd like to know:

How many times did each JSON file get imported into the worksheet per month?
Are there any increases month over month in requests for landing zone, security, avd, etc?
Are there any anomalies in JSON file requests or increases in static XLSX file requests that indicate a problem with the macros?

A few ways we might do this...

Update the GH actions workflow to publish new release versions of the JSON files with every merge to main. The release tracking / insights in GH would provide tracking. This is not ideal because it would require a new release to be published (automating multiple steps) with EVERY merge to main.
On merge, publish the JSON files to blob storage with metrics logged. We could then build monitoring workbooks on the data.
Build custom telemetry calls into the VBA code to call a logging API when a JSON file is being imported.

My instinct is that option 2 is the right path forward. In all 3 cases, the macros would need to be updated in some way to account for tracking the download of JSON files.

Translation action modifying GUIDs

For example, if you look at PR #153 you can see that the translator tried to modify the GUIDs of some of the items. It also translates the Graph queries, which is obviously wrong.

How to generate the graph result from Azure

Hi Team,
May I ask how to generate the graph result from Azure?
What query do I need to write to match the review checklist and convert the result to JSON?

Thank you

Status drop box is not available after line item 153 in the checklist.

In the landing zone checklist the Status drop box which has three options (Open, Fulfilled and NA) is not available after the Line item 153 which needs to be fixed.

Review wording of LZ checklist recommendations

Depending on #181 (since we should start with high prio items), we might want to review the wording of the different checks in the LZ checklist, since some of them could be rephrased more clearly.

HTML-based presentation layer

@seenu433 Thanks for your suggestion on building an HTML layer!

My thought is to consider a json to html formatting (something like this) so that it could be all within

It would be fantastic if we could create a static HTML "app", ideally that allowed users not only to view the table, but to fill it in and save the result somewhere.

Use custom translators

Potentially an Azure Custom Translator could be used to improve the translation quality

Translators break JSON structure

Translators break JSON structure of the original file. They could be fixed by running this jq query:

cat ./aks_checklist.es.json | jq '. |= . + {"severities": [ .severities[] ] } + {"status": [ .status[] ] } + {"categories": [ .categories[] ] } + {"items": [ .items[] ] }'
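
The same repair can be sketched in Python for anyone who prefers it over jq. The function below turns the four top-level sections back into arrays, whether the translator left them as arrays or converted them into dictionaries; the sample input is invented for the example.

```python
import json

SECTIONS = ("severities", "status", "categories", "items")


def values_as_list(node):
    """Return the values of a dict, or the elements of a list, as a list."""
    if isinstance(node, dict):
        return list(node.values())
    return list(node)


def restore_arrays(checklist, sections=SECTIONS):
    """Rebuild the given top-level sections as arrays, leaving other keys alone."""
    fixed = dict(checklist)
    for section in sections:
        fixed[section] = values_as_list(checklist[section])
    return fixed


# Hypothetical translator output where arrays became index-keyed dictionaries.
broken = {
    "severities": {"0": {"name": "Alto"}, "1": {"name": "Medio"}},
    "status": {"0": {"name": "Abierto"}},
    "categories": [{"name": "Redes"}],
    "items": {"0": {"guid": "00000000-0000-0000-0000-000000000001", "text": "Ejemplo"}},
}

fixed = restore_arrays(broken)
print(json.dumps(fixed, indent=2))
```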

Clarify how to add a new item in the JSON file and the role of GUID

@erjosito correct me if I'm wrong and missed any part, but there is no clear step-by-step guide on how to add a new element to a checklist JSON file. I remember when you explained to me the role of the GUID attribute, and how to generate one using any of the 1,000,000 tools/scripts out there, so it is trivial, OK, but that piece should be documented somewhere, right?
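
Until that is documented, here is a minimal sketch of the idea: each new item gets a freshly generated GUID so that it can be identified independently of its position or wording. The item fields used here (guid, category, severity, text) are assumptions for illustration, and the repo's Contributing guide remains the authority on the actual process.

```python
import uuid


def new_item(text, category, severity):
    """Build a checklist item with a randomly generated, unique GUID."""
    return {
        "guid": str(uuid.uuid4()),  # random version-4 UUID
        "category": category,
        "severity": severity,
        "text": text,
    }


checklist = {"items": []}
checklist["items"].append(new_item("Example recommendation A", "Networking", "High"))
checklist["items"].append(new_item("Example recommendation B", "Security", "Low"))

for item in checklist["items"]:
    print(item["guid"], item["text"])
```

Once assigned, a GUID should not change when the text is edited or translated, otherwise existing reviews can no longer be matched to the item.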

Error on download

Error when downloading the file

"Excel cannot open the file 'review_****.xlsm' because the file format or file extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file."

Multiple people, multiple locations.

Consider building on top of "Microsoft Assessments" (like Well Architected Review)?

See https://docs.microsoft.com/en-us/assessments/. This way customers can also self-serve? Checklists JSON could possibly be consumed from (or pushed to) that platform?

Incomplete recommendations

Hello,

The following two recommendations are missing some details:

Network Topology and Connectivity -> Virtual WAN:
Recommendation 50 simply says 'Follow the principle'.

Network Topology and Connectivity -> Hub and spoke
Recommendation 56 says 'Consider a network design based on the traditional hub-and-spoke network topology for the following scenarios', but doesn't actually specify the scenarios.

Statement/Disclaimer around sign-off or endorsement of a landing zone/design document

Should we add a statement in the welcome page or in the disclaimer stating that these spreadsheets/reviews are by no means a sign-off/endorsement of a customer's or partner's landing zone or any other design?

VBA not working in Excel for MacOS

The VBA code is not working for Mac OS X. There are some minor issues (like environment variables having different names), but the main problem is that the "Microsoft Scripting Runtime" is apparently not available in Office for Mac. This runtime is required to work with objects like dictionaries, which are required to process the JSON code.

Configure "Wrap text" in "Checklist item" column

Today the elements in the Checklist item column can extend into the next column (Description). The good thing about that is that for the LZ and AKS checklists (which don't have a Description field today), the "Checklist item" text can extend and cover the description cell, which makes it more readable. The downside is that engineers might be confused and fill in the description field instead of the item field.

To prevent that from happening, "Wrap text" could be enabled in the "Checklist item", which would prevent flooding. Any opinion about what would be better from a readability perspective?

Language Support for Brazilian Portuguese

LZ checklist - Wrap experience not consistent amongst items

While most of the items are not wrapped in the LZ list, some of them are :

Ensure, for new large or global network deployments in Azure where you need global transit connectivity across Azure regions and on-premises locations, use virtual WAN

Use a Virtual WAN hub per Azure region to connect multiple landing zones together across Azure regions via a common global Azure Virtual WAN.

When low latency is required, or throughput from on-premises to Azure must be greater than 10 Gbps, enable FastPath to bypass the ExpressRoute gateway from the data path.

Provision Azure Key Vault with the soft delete and purge policies enabled to allow retention protection for deleted objects.

etc

This makes it a bit confusing to read.

Design - Import button hard to spot

I actually had a hard time seeing the import button -_-‘
The advanced ones stand out so much that the one in the easy box is easy to miss. I would suggest grouping all the buttons together.

Wording "enforce the use of RI"

This recommendation "Enforce the use of reserved instances to prioritize reserved capacity in required regions. Then the workload will have the required capacity even when there's a high demand for that resource in a specific region"

This seems to suggest there is a way to "enforce" RI using things like Azure Policy. Is there any?
If not, we should remove the word "enforce" in my opinion.

Language Support for Spanish

New data structure to support combined ARG Query

Can we please look at adding an additional column to support "combined" compliant and non-compliant queries.
@erjosito is aware of the ask :)

Support.md needs to be reviewed and updated for public audience

https://github.com/Azure/review-checklists/blob/main/SUPPORT.md is not updated

ARG - Add graph query for 'Enforce or appended resource tags through Azure Policy'

Add an ARG query for the LZ checklist item 'Enforce or appended resource tags through Azure Policy'.

Write the query to determine whether Policy Assignments exist at the graph query scope for Policies which require or modify resource tags. If assignments exist, return a list of assignment IDs and compliance value '1'

@fskelly
Still testing this, I'll add it to a PR when completed:
policyresources
| where type == "microsoft.authorization/policydefinitions"
| where (properties.policyRule.then.effect in ('append','modify') and properties.policyRule.then matches regex @'tags\[') or (properties.policyRule.then.effect == 'deny' and properties.policyRule.if matches regex @'tags\[')
| project id
| union (
    policyresources
    | where type == "microsoft.authorization/policydefinitions"
    | where (properties.policyRule.then.effect in ('append','modify') and properties.policyRule.then matches regex @'tags\[') or (properties.policyRule.then.effect == 'deny' and properties.policyRule.if matches regex @'tags\[')
    | project id
    | join kind=innerunique (
        policyresources
        | where type == "microsoft.authorization/policysetdefinitions"
        | mv-expand polSetDef = properties.policySetDefinitions
        | extend polSetPolDefId = tostring(polSetDef.policyDefinitionId)
        | project id, polSetPolDefId
    ) on $left.id == $right.polSetPolDefId
    | project id
)
| join (
    policyresources
    | where type == "microsoft.authorization/policyassignments"
    | extend policyDefId = tostring(properties.policyDefinitionId)
    | project policyDefId, assignmentName = name, assignmentScope = properties.scope, assignmentId = id, compliant = 1
) on $left.id == $right.policyDefId
| project-away id
| project id = assignmentId, compliant

Running ./checklist_graph.sh --technology=lz --format=json results in a number of compliant=undefined results

I ran ./checklist_graph.sh --technology=lz --format=json in an Azure Cloud Shell (bash), but a number of the results had "compliant": "undefined". After a quick look, it appears that a number of the graph queries in lz_checklist.en.json use Compliant as the projected column (e.g. https://github.com/Azure/review-checklists/blob/main/checklists/lz_checklist.en.json#L135), but checklist_graph.sh seems to expect it to be compliant: https://github.com/Azure/review-checklists/blob/main/scripts/checklist_graph.sh#L256 (note the casing difference).
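
One way to make a consumer robust against this kind of casing mismatch is to look the column up case-insensitively. The sketch below shows the idea in Python on made-up result rows; it is not the actual logic of checklist_graph.sh, which is a shell script.

```python
def compliance_of(row):
    """Return the value of the 'compliant' column regardless of its casing, or None."""
    for key, value in row.items():
        if key.lower() == "compliant":
            return value
    return None


# Hypothetical query results: the same column projected with different casing.
rows = [
    {"id": "resource-a", "compliant": 1},
    {"id": "resource-b", "Compliant": 0},
    {"id": "resource-c"},  # the query did not project the column at all
]

statuses = [compliance_of(row) for row in rows]
print(statuses)
```

The alternative fix is to normalize the column name in the checklist queries themselves, so that every consumer sees the same casing.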

Can we create a Managed Identity as part of a bicep deployment

Investigate the use of managed identity within Bicep

"Use federated keyvault"?

Hey there, a0477a20-9945-4bda-9333-4f2491163418 item in the LZ review checklist says "Use a federated Azure Key Vault model to avoid transaction scale limits". I can't seem to be able to find what a "federated AKV model" means. Does this mean "separated AKV instances per app"?

Inconsistent 'Status Options' across the checklists

The AKS checklist has the following status options: Not verified, Open, Fulfilled, Not required, N/A. All other checklists have the same options except for 'Not required'. As a result, users of these checklists see 'N/A' twice in the dropdown when reviewing a checklist item in the spreadsheet, because Column H in the Values page of an empty spreadsheet is pre-populated with all status options.

Question: Should all checklists align to the same set of status options that the AKS checklist is already using? I presume yes, and I have a draft PR ready to review and approve here #157, but I opened this issue in case there's a need/want to have different status options.

Sample spreadsheet tech/language selection misleading

The tech/language selection in the sample spreadsheet uses global variables that are set the first time an option is clicked. If the user opens the spreadsheet and clicks the download button right away, the content of the global variables could differ from the values shown in the options.

"Deploy to Azure" button to be created

Work on a simple "deploy to azure" button.
Work started and base code ready for existing container.

Support tradeoff decisions

Today the checklists include binary recommendations, for example "configure egress traffic through a NGFW". However, design decisions are often a tradeoff between different aspects of a design, and following a certain recommendation might increase one, but decrease another one. For example, injecting an AzFW increases security, but impacts negatively the cost and complexity of the design. Hence, depending on the main goal of a certain architecture, the right answer to the recommendation might vary: for security-optimized designs the recommendation would be one, but for cost-optimized designs the recommendation would be another.

In order to support this, two things would need to be modified:

The answers to the recommendations should allow for more variation. Not only "Fulfilled", but something like "Yes, doing this already" and "ACK'ed but thanks, no thanks"
Recommendations should include a "weight" that gives an idea of which design pillars (Security, Cost, Complexity, Resiliency) are impacted when the recommendation is implemented and when it is not

Having this in the checklists would make it possible to do reviews for security-optimized, resiliency-optimized or cost-optimized designs, for example.
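
To illustrate how such weights might be used, the sketch below scores a recommendation against the priorities of a given design. Everything in it is hypothetical: the weight values, the priority profiles and the scoring rule (a simple weighted sum) are only one possible way to model the tradeoff.

```python
# Hypothetical weights: a positive number means the pillar improves when the
# recommendation is implemented, a negative number means it gets worse.
firewall_weights = {"Security": 2, "Cost": -1, "Complexity": -1}

# Hypothetical design profiles: how much each design cares about each pillar.
security_optimized = {"Security": 3, "Cost": 1, "Complexity": 1}
cost_optimized = {"Security": 1, "Cost": 3, "Complexity": 1}


def recommendation_score(weights, priorities):
    """Weigh a recommendation's per-pillar impact by the design's priorities."""
    return sum(impact * priorities.get(pillar, 0) for pillar, impact in weights.items())


security_score = recommendation_score(firewall_weights, security_optimized)
cost_score = recommendation_score(firewall_weights, cost_optimized)
print("security-optimized design:", security_score)
print("cost-optimized design:", cost_score)
```

With these made-up numbers the same recommendation scores positively for the security-optimized design and negatively for the cost-optimized one, which is the kind of variation the issue asks the checklists to support.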

"Question" or "Checklist Item"?

Many entries in all checklists are not phrased in a way that can be identified as a "Question".
For this reason, I would propose to change the name of the 3rd column in the sheet, from "Question" to something like "Checklist Item" or "Controls" or "Item to Check".

parse error: Invalid escape at line 285, column 436

Hello,
I am attempting to use checklist_graph.sh on a Linux VM and connecting to my tenant/subs using Azure CLI.
When running the script the following error occurs:

parse error: Invalid escape at line 285, column 436

Consider adding a version number to checklists

Currently, when checklists are shared around, there is no indication of when the checklist content was imported. For example, if an XLSX file is being passed around to bypass client-side restrictions on macros, the spreadsheet may contain outdated content depending on when it was generated.

Consider adding a version number or "Imported Date" to the pre-generated XLSX files and to the macro import checklist action.

Consider a macro-free approach to the Excel-design

Excel supports Power Query, which can use, for example, an external JSON file as a data source. Power Query can pull in the JSON, massage the data and make it available in certain tables. This would avoid having to deal with macros. The user action is then a simple push on the "refresh" button. Happy to try this out when I get some bandwidth.

AKS Checklist clarifications

Recommendation item:

If using Azure Disks and AZs, consider dedicating nodes per AZ and WaitForFirstConsumer

The details provided don't adequately clarify how to dedicate nodes per AZ, or how setting WaitForFirstConsumer will behave (e.g. will the disk created be pinned to the same zone as the VM, and will subsequent pods sharing the PVC be scheduled in the same zone?). More clarity would be helpful.

Make sure no changes are performed by operators in the node RG (aka 'infra RG')

More details on how to achieve this would be helpful. Applying locks doesn't help and it restricts cluster functionality. If there are any alternatives, it would be good to document them.

If not using ephemeral disks, use larger OS disks for the nodes, especially if many pods/node

More details on how a larger disk would help in scenarios with many pods per node would be helpful.