Message ID | 20230504145023.835096-3-ross.philipson@oracle.com |
---|---|
State | Superseded |
Series | x86: Trenchboot secure dynamic launch Linux kernel support |
On 5/6/23 04:48, Bagas Sanjaya wrote: > On Thu, May 04, 2023 at 02:50:11PM +0000, Ross Philipson wrote: >> +===================================== >> +System Launch Integrity documentation >> +===================================== >> + >> +.. toctree:: > > By convention, doc toctree have 2-level depth (only page title and > first-level headings are visible). You may consider adding > `:maxdepth: 2` option. Will do. > >> diff --git a/Documentation/security/launch-integrity/principles.rst b/Documentation/security/launch-integrity/principles.rst >> new file mode 100644 >> index 0000000..73cf063 >> --- /dev/null >> +++ b/Documentation/security/launch-integrity/principles.rst >> @@ -0,0 +1,313 @@ >> +======================= >> +System Launch Integrity >> +======================= >> + >> +This document serves to establish a common understanding of what a system >> +launch is, the integrity concern for system launch, and why using a Root of Trust >> +(RoT) from a Dynamic Launch may be desired. Throughout this document >> +terminology from the Trusted Computing Group (TCG) and National Institute of >> +Standards and Technology (NIST) is used to ensure that vendor-neutral language is >> +used to describe and reference security-related concepts. >> + >> +System Launch >> +============= >> + >> +There is a tendency to consider the classical power-on boot as the only >> +means to launch an Operating System (OS) on a computer system, but in fact most >> +modern processors support two methods to launch the system. To provide clarity a >> +common definition of a system launch should be established. This definition is >> +that, during a single power life cycle of a system, a System Launch consists >> +of an initialization event, typically in hardware, that is followed by an >> +executing software payload that takes the system from the initialized state to >> +a running state.
Driven by the Trusted Computing Group (TCG) architecture, >> +modern processors are able to support two methods to launch a system; these two >> +types of system launch are known as Static Launch and Dynamic Launch. >> + >> +Static Launch >> +------------- >> + >> +Static launch is the system launch associated with the power cycle of the CPU. >> +Thus static launch refers to the classical power-on boot where the >> +initialization event is the release of the CPU from reset and the system >> +firmware is the software payload that brings the system up to a running state. >> +Since static launch is the system launch associated with the beginning of the >> +power life cycle of a system, it is therefore a fixed, one-time system launch. >> +It is because of this that static launch is referred to and thought of as being >> +"static". >> + >> +Dynamic Launch >> +-------------- >> + >> +Modern CPU architectures provide a mechanism to re-initialize the system to a >> +"known good" state without requiring a power event. This re-initialization >> +event is the event for a dynamic launch and is referred to as the Dynamic >> +Launch Event (DLE). The DLE functions by accepting a software payload, referred >> +to as the Dynamic Configuration Environment (DCE), that execution is handed to >> +after the DLE is invoked. The DCE is responsible for bringing the system back >> +to a running state. Since the dynamic launch is not tied to a power event like >> +the static launch, this enables a dynamic launch to be initiated at any time >> +and multiple times during a single power life cycle. This dynamism is the >> +reasoning behind referring to this system launch as being dynamic. >> + >> +Because a dynamic launch can be conducted at any time during a single power >> +life cycle, dynamic launches are classified into one of two types, an early launch or a >> +late launch.
>> + >> +:Early Launch: When a dynamic launch is used as a transition from a static >> + launch chain to the final Operating System. >> + >> +:Late Launch: The usage of a dynamic launch by an executing Operating System to >> + transition to a "known good" state to perform one or more operations, e.g. to >> + launch into a new Operating System. >> + >> +System Integrity >> +================ >> + >> +A computer system can be considered a collection of mechanisms that work >> +together to produce a result. The assurance that the mechanisms are functioning >> +correctly and producing the expected result is the integrity of the system. To >> +ensure a system's integrity there is a subset of these mechanisms, commonly >> +referred to as security mechanisms, that are present to help ensure the system >> +produces the expected result or at least detect that an unexpected >> +result may have happened. Since the security mechanisms are relied upon to >> +ensure the integrity of the system, these mechanisms are trusted. Upon >> +inspection these security mechanisms each have a set of properties and these >> +properties can be evaluated to determine how susceptible a mechanism might be >> +to failure. This assessment is referred to as the Strength of Mechanism and for >> +a trusted mechanism enables the trustworthiness of that mechanism to be >> +quantified. >> + >> +For software systems there are two system states for which the integrity is >> +critical, when the software is loaded into memory and when the software is >> +executing on the hardware. Ensuring that the expected software is loaded into >> +memory is referred to as load-time integrity while ensuring that the software >> +executing is the expected software is the runtime integrity of that software. >> + >> +Load-time Integrity >> +------------------- >> + >> +It is critical to understand what load-time integrity establishes about a >> +system and what is assumed, i.e. what is being trusted.
Load-time integrity is >> +when a trusted entity, i.e. an entity with an assumed integrity, takes an >> +action to assess an entity being loaded into memory before it is used. A >> +variety of mechanisms may be used to conduct the assessment, each with >> +different properties. A particular property is whether the mechanism creates >> +evidence of the assessment. Cryptographic signature checking and >> +hashing are the most common assessment operations used. >> + >> +A signature checking assessment functions by requiring a representation of the >> +accepted authorities and uses those representations to assess if the entity has >> +been signed by an accepted authority. The benefit of this process is that the >> +assessment process includes an adjudication of the assessment. The drawbacks >> +are that 1) the adjudication is susceptible to tampering by the Trusted >> +Computing Base (TCB), 2) there is no evidence to assert that an untampered >> +adjudication was completed, and 3) the system must be an active participant in >> +the key management infrastructure. >> + >> +A cryptographic hashing assessment does not adjudicate the assessment but >> +instead generates evidence of the assessment to be adjudicated independently. >> +The benefit of this approach is that the assessment may be simple such that it >> +is able to be implemented as an immutable mechanism, e.g. in hardware. >> +Additionally it is possible for the adjudication to be conducted where it >> +cannot be tampered with by the TCB. The drawback is that a compromised >> +environment will be allowed to execute until an adjudication can be completed. >> + >> +Ultimately load-time integrity provides confidence that the correct entity was >> +loaded and in the absence of a run-time integrity mechanism assumes, i.e. >> +trusts, that the entity will never become corrupted.
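The hashing assessment described in the quoted text can be sketched as an "extend" operation: each new measurement is folded into a running value, so the final value is evidence of every entity measured and of the order in which they were measured. What follows is only a minimal sketch under stated assumptions. `toy_hash` is a non-cryptographic FNV-1a stand-in for a real digest such as SHA-256, and `measure_extend`, the 64-bit register, and `extend_example` are hypothetical names chosen for illustration; none of them come from the patch or from a TPM interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy 64-bit FNV-1a hash: a NON-cryptographic stand-in for SHA-256. */
static uint64_t toy_hash(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint64_t h = 0xcbf29ce484222325ULL;      /* FNV-1a offset basis */

    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;               /* FNV-1a prime */
    }
    return h;
}

/*
 * Hypothetical extend operation: new = H(old || H(entity)).
 * The register is never overwritten directly, only extended.
 */
static uint64_t measure_extend(uint64_t reg, const void *entity, size_t len)
{
    uint64_t buf[2] = { reg, toy_hash(entity, len) };

    return toy_hash(buf, sizeof(buf));
}

/* Usage: the same entities in the same order reproduce the same value. */
void extend_example(void)
{
    uint64_t first = measure_extend(measure_extend(0, "dce", 3), "dlme", 4);
    uint64_t again = measure_extend(measure_extend(0, "dce", 3), "dlme", 4);

    assert(first == again);
}
```

The sketch only generates evidence. Deciding whether the final value is acceptable (the adjudication) is deliberately left out, which mirrors the separation the quoted section draws between hashing and signature checking.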
>> + >> +Runtime Integrity >> +----------------- >> + >> +Runtime integrity in the general sense is when a trusted entity makes an >> +assessment of an entity at any point in time during the assessed entity's >> +execution. A more concrete explanation is the taking of an integrity assessment >> +of an active process executing on the system at any point during the process' >> +execution. Often the load-time integrity of an operating system's user-space, >> +i.e. the operating environment, is confused with the runtime integrity of the >> +system since it is an integrity assessment of the "runtime" software. The >> +reality is that actual runtime integrity is a very difficult problem and thus >> +not very many solutions are public and/or available. One example of a runtime >> +integrity solution would be Johns Hopkins Applied Physics Laboratory's (APL) >> +Linux Kernel Integrity Measurer (LKIM). >> + >> +Trust Chains >> +============ >> + >> +Building upon the understanding of security mechanisms to establish load-time >> +integrity of an entity, it is possible to chain together load-time integrity >> +assessments to establish the integrity of the whole system. This process is >> +known as transitive trust and provides the concept of building a chain of >> +load-time integrity assessments, commonly referred to as a trust chain. These >> +assessments may be used to adjudicate the load-time integrity of the whole >> +system. This trust chain is started by a trusted entity that does the first >> +assessment. This first entity is referred to as the Root of Trust (RoT) with the >> +entity's name being derived from the mechanism used for the assessment, i.e. >> +RoT for Verification (RTV) and RoT for Measurement (RTM). >> + >> +A trust chain is itself a mechanism, specifically a mechanism of mechanisms, >> +and therefore it too has a Strength of Mechanism.
The factors that contribute >> +to a trust chain's strength are: >> + >> + - The strength of the chain's RoT >> + - The strength of each member of the trust chain >> + - The length, i.e. the number of members, of the chain >> + >> +Therefore to provide the strongest trust chains, they should start with a >> +strong RoT and should consist of low-complexity members, with as few members >> +participating as possible. In a more colloquial sense, >> +a trust chain is only as strong as its weakest link and more links increase >> +the probability of a weak link. >> + >> +Dynamic Launch Components >> +========================= >> + >> +The TCG architecture for dynamic launch is composed of a series of components that >> +are used to set up and then carry out the launch. These components work together >> +to construct an RTM trust chain that is rooted in the dynamic launch and thus >> +commonly referred to as the Dynamic Root of Trust for Measurement (DRTM) chain. >> + >> +What follows is a brief explanation of each component in execution order. A >> +subset of these components is what establishes the dynamic launch's trust >> +chain. >> + >> +Dynamic Configuration Environment Preamble >> +------------------------------------------ >> + >> +The Dynamic Configuration Environment (DCE) Preamble is responsible for setting >> +up the system environment in preparation for a dynamic launch. The DCE Preamble >> +is not a part of the DRTM trust chain. >> + >> +Dynamic Launch Event >> +-------------------- >> + >> +The dynamic launch event is the event, typically a CPU instruction, that triggers >> +the system's dynamic launch mechanism to begin the launch. The dynamic launch >> +mechanism is also the RoT for the DRTM trust chain. >> + >> +Dynamic Configuration Environment >> +--------------------------------- >> + >> +The dynamic launch mechanism may have resulted in a reset of a portion of the >> +system.
To bring the system back to an adequate state for system software the >> +dynamic launch will hand over control to the DCE. Prior to handing over this >> +control, the dynamic launch will measure the DCE. Once the DCE is complete it >> +will proceed to measure and then execute the Dynamic Launch Measured >> +Environment (DLME). >> + >> +Dynamic Launch Measured Environment >> +----------------------------------- >> + >> +The DLME is the first system kernel to have control of the system but may not >> +be the last. Depending on the usage and configuration, the DLME may be the >> +final/target operating system or it may be a boot loader that will load the >> +final/target operating system. >> + >> +Why DRTM >> +======== >> + >> +It is a fact that DRTM increases the load-time integrity of the system by >> +providing a trust chain that has an immutable hardware RoT and uses a limited >> +number of small, special-purpose pieces of code to establish the trust chain that starts >> +the target operating system. As mentioned in the Trust Chains section, these are >> +the main three factors in driving up the strength of a trust chain. As can be >> +seen from the BootHole exploit, which in fact did not affect the integrity of >> +DRTM solutions, the sophistication of attacks targeting system launch is at an >> +all-time high. There is no reason a system should not employ every integrity >> +measure hardware makes available. This is the crux of a defense-in-depth >> +approach to system security. In the past the now closed SMI gap was often >> +pointed to as invalidating DRTM, which in fact was nothing but a strawman >> +argument. As has continued to be demonstrated, if/when SMM is corrupted it can >> +always circumvent all load-time integrity, SRTM and DRTM, because it is a >> +run-time integrity problem.
Regardless, Intel and AMD have both deployed >> +runtime integrity for SMI and SMM which is tied directly to DRTM such that this >> +perceived deficiency is now non-existent and the world is moving forward with >> +an expectation that DRTM must be present. >> + >> +Glossary >> +======== >> + >> +.. glossary:: >> + integrity >> + Guarding against improper information modification or destruction, and >> + includes ensuring information non-repudiation and authenticity. >> + >> + - NIST CNSSI No. 4009 - https://www.cnss.gov/CNSS/issuances/Instructions.cfm >> + >> + mechanism >> + A process or system that is used to produce a particular result. >> + >> + - NIST Special Publication 800-160 (VOLUME 1 ) - https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-160v1.pdf >> + >> + risk >> + A measure of the extent to which an entity is threatened by a potential >> + circumstance or event, and typically a function of: (i) the adverse impacts >> + that would arise if the circumstance or event occurs; and (ii) the >> + likelihood of occurrence. >> + >> + - NIST SP 800-30 Rev. 1 - https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-30r1.pdf >> + >> + security mechanism >> + A device or function designed to provide one or more security services >> + usually rated in terms of strength of service and assurance of the design. >> + >> + - NIST CNSSI No. 4009 - https://www.cnss.gov/CNSS/issuances/Instructions.cfm >> + >> + Strength of Mechanism >> + A scale for measuring the relative strength of a security mechanism >> + >> + - NIST CNSSI No. 4009 - https://www.cnss.gov/CNSS/issuances/Instructions.cfm >> + >> + transitive trust >> + Also known as "Inductive Trust", in this process a Root of Trust gives a >> + trustworthy description of a second group of functions. Based on this >> + description, an interested entity can determine the trust it is to place in >> + this second group of functions. 
If the interested entity determines that >> + the trust level of the second group of functions is acceptable, the trust >> + boundary is extended from the Root of Trust to include the second group of >> + functions. In this case, the process can be iterated. The second group of >> + functions can give a trustworthy description of the third group of >> + functions, etc. Transitive trust is used to provide a trustworthy >> + description of platform characteristics, and also to prove that >> + non-migratable keys are non-migratable. >> + >> + - TCG Glossary - https://trustedcomputinggroup.org/wp-content/uploads/TCG-Glossary-V1.1-Rev-1.0.pdf >> + >> + trust >> + The confidence one element has in another that the second element will >> + behave as expected. >> + >> + - NISTIR 8320A - https://nvlpubs.nist.gov/nistpubs/ir/2021/NIST.IR.8320A.pdf >> + >> + trust anchor >> + An authoritative entity for which trust is assumed. >> + >> + - NIST SP 800-57 Part 1 Rev. 5 - https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-57pt1r5.pdf >> + >> + trusted >> + An element that another element relies upon to fulfill critical >> + requirements on its behalf. >> + >> + - NISTIR 8320A - https://nvlpubs.nist.gov/nistpubs/ir/2021/NIST.IR.8320A.pdf >> + >> + trusted computing base (TCB) >> + Totality of protection mechanisms within a computer system, including >> + hardware, firmware, and software, the combination responsible for enforcing >> + a security policy. >> + >> + - NIST CNSSI No. 4009 - https://www.cnss.gov/CNSS/issuances/Instructions.cfm >> + >> + trusted computer system >> + A system that has the necessary security functions and assurance that the >> + security policy will be enforced and that can process a range of >> + information sensitivities (i.e. classified, controlled unclassified >> + information (CUI), or unclassified public information) simultaneously. >> + >> + - NIST CNSSI No.
4009 - https://www.cnss.gov/CNSS/issuances/Instructions.cfm >> + >> + trustworthiness >> + The attribute of a person or enterprise that provides confidence to others >> + of the qualifications, capabilities, and reliability of that entity to >> + perform specific tasks and fulfill assigned responsibilities. >> + >> + - NIST CNSSI No. 4009 - https://www.cnss.gov/CNSS/issuances/Instructions.cfm >> diff --git a/Documentation/security/launch-integrity/secure_launch_details.rst b/Documentation/security/launch-integrity/secure_launch_details.rst >> new file mode 100644 >> index 0000000..2e71543 >> --- /dev/null >> +++ b/Documentation/security/launch-integrity/secure_launch_details.rst >> @@ -0,0 +1,564 @@ >> +=================================== >> +Secure Launch Config and Interfaces >> +=================================== >> + >> +Configuration >> +============= >> + >> +The settings to enable Secure Launch using Kconfig are under:: >> + >> + "Processor type and features" --> "Secure Launch support" >> + >> +A kernel with this option enabled can still be booted using other supported >> +methods. >> + >> +To reduce the Trusted Computing Base (TCB) of the MLE [1]_, the build >> +configuration should be pared down as narrowly as one's use case allows. >> +The fewer drivers (less active hardware) and features reduces the attack >> +surface. E.g. in the extreme, the MLE could only have local disk access >> +and no other hardware support. Or only network access for remote attestation. >> + >> +It is also desirable if possible to embed the initrd used with the MLE kernel >> +image to reduce complexity. >> + >> +The following are a few important configuration necessities to always consider: >> + >> +KASLR Configuration >> +------------------- >> + >> +Secure Launch does not interoperate with KASLR. 
If possible, the MLE should be >> +built with KASLR disabled:: >> + >> + "Processor type and features" --> >> + "Build a relocatable kernel" --> >> + "Randomize the address of the kernel image (KASLR) [ ]" >> + >> +This unsets the Kconfig value CONFIG_RANDOMIZE_BASE. >> + >> +If not possible, KASLR must be disabled on the kernel command line when doing >> +a Secure Launch as follows:: >> + >> + nokaslr >> + >> +IOMMU Configuration >> +------------------- >> + >> +When doing a Secure Launch, the IOMMU should always be enabled and the drivers >> +loaded. However, IOMMU passthrough mode should never be used. Passthrough mode leaves the >> +MLE completely exposed to DMA after the PMRs [2]_ are disabled. The current default >> +mode is to use IOMMU in lazy translated mode, but strict translated mode is the preferred >> +IOMMU mode and this should be selected in the build configuration:: >> + >> + "Device Drivers" --> >> + "IOMMU Hardware Support" --> >> + "IOMMU default domain type" --> >> + "(X) Translated - Strict" >> + >> +In addition, the Intel IOMMU should be on by default. The following sets this as the >> +default in the build configuration:: >> + >> + "Device Drivers" --> >> + "IOMMU Hardware Support" --> >> + "Support for Intel IOMMU using DMA Remapping Devices [*]" >> + >> +and:: >> + >> + "Device Drivers" --> >> + "IOMMU Hardware Support" --> >> + "Support for Intel IOMMU using DMA Remapping Devices [*]" --> >> + "Enable Intel DMA Remapping Devices by default [*]" >> + >> +It is recommended that no other command line options be set to override >> +the defaults above. >> + >> +Secure Launch Resource Table >> +============================ >> + >> +The Secure Launch Resource Table (SLRT) is a platform-agnostic, standard format >> +for providing information for the pre-launch environment and to pass >> +information to the post-launch environment.
The table is populated by one or >> +more bootloaders in the boot chain and used by Secure Launch to determine how to set up >> +the environment during post-launch. The details for the SLRT are documented >> +in the TrenchBoot Secure Launch Specification [3]_. >> + >> +Intel TXT Interface >> +=================== >> + >> +The primary interfaces between the various components in TXT are the TXT MMIO >> +registers and the TXT heap. The MMIO register banks are described in Appendix B >> +of the TXT MLE [1]_ Development Guide. >> + >> +The TXT heap is described in Appendix C of the TXT MLE [1]_ Development >> +Guide. Most of the TXT heap is predefined in the specification. The heap is >> +initialized by firmware and the pre-launch environment and is subsequently used >> +by the SINIT ACM. One section, called the OS to MLE Data Table, is reserved for >> +software to define. This table is set up per the recommendation detailed in >> +Appendix B of the TrenchBoot Secure Launch Specification:: >> + >> + /* >> + * Secure Launch defined OS/MLE TXT Heap table >> + */ >> + struct txt_os_mle_data { >> + u32 version; >> + u32 boot_params_addr; >> + struct slr_table *slrt; >> + u64 txt_info; >> + u32 ap_wake_block; >> + u32 ap_wake_block_size; >> + u8 mle_scratch[64]; >> + } __packed; >> + >> +Description of structure: >> + >> +===================== ======================================================================== >> +Field Use >> +===================== ======================================================================== >> +version Structure version, current value 1 >> +boot_params_addr Physical base address of the Linux boot parameters >> +slrt Physical address of the Secure Launch Resource Table >> +txt_info Pointer into the SLRT for easily locating TXT specific table >> +ap_wake_block Physical address of the block of memory for parking APs after a launch >> +ap_wake_block_size Size of the AP wake block >> +mle_scratch Scratch area used post-launch by the MLE kernel.
Fields: >> + >> + - SL_SCRATCH_AP_EBX area to share %ebx base pointer among CPUs >> + - SL_SCRATCH_AP_JMP_OFFSET offset to abs. ljmp fixup location for APs >> +===================== ======================================================================== >> + >> +Error Codes >> +----------- >> + >> +The TXT specification defines the layout for TXT 32-bit error code values. >> +The bit encodings indicate where the error originated (e.g. with the CPU, >> +in the SINIT ACM, in software). The error is written to a sticky TXT >> +register that persists across resets called TXT.ERRORCODE (see the TXT >> +MLE Development Guide). The errors defined by the Secure Launch feature are >> +those generated in the MLE software. They have the format:: >> + >> + 0xc0008XXX >> + >> +The low 12 bits are free for defining the following Secure Launch specific >> +error codes. >> + >> +====== ================ >> +Name: SL_ERROR_GENERIC >> +Value: 0xc0008001 >> +====== ================ >> + >> +Description: >> + >> +Generic catch-all error. Currently unused. >> + >> +====== ================= >> +Name: SL_ERROR_TPM_INIT >> +Value: 0xc0008002 >> +====== ================= >> + >> +Description: >> + >> +The Secure Launch code failed to get access to the TPM hardware interface. >> +This is most likely due to misconfigured hardware or kernel. Ensure the >> +TPM chip is enabled and the kernel TPM support is built in (it should not be >> +built as a module). >> + >> +====== ========================== >> +Name: SL_ERROR_TPM_INVALID_LOG20 >> +Value: 0xc0008003 >> +====== ========================== >> + >> +Description: >> + >> +The Secure Launch code failed to find a valid event log descriptor for TPM >> +version 2.0 or the event log descriptor is malformed. Usually this indicates >> +incompatible versions of the pre-launch environment and the MLE kernel.
>> +The pre-launch environment and the kernel share a structure in the TXT heap and >> +if this structure (the OS-MLE table) is mismatched, this error is often seen. >> +This TXT heap area is set up by the pre-launch environment so the issue may >> +originate there. It could be the sign of an attempted attack. >> + >> +====== =========================== >> +Name: SL_ERROR_TPM_LOGGING_FAILED >> +Value: 0xc0008004 >> +====== =========================== >> + >> +Description: >> + >> +There was a failed attempt to write a TPM event to the event log early in the >> +Secure Launch process. This is likely the result of a malformed TPM event log >> +buffer. Formatting of the event log buffer information is done by the >> +pre-launch environment so the issue most likely originates there. >> + >> +====== ============================ >> +Name: SL_ERROR_REGION_STRADDLE_4GB >> +Value: 0xc0008005 >> +====== ============================ >> + >> +Description: >> + >> +During early validation a buffer or region was found to straddle the 4GB >> +boundary. Because of the way TXT does DMA memory protection, this is an >> +unsafe configuration and is flagged as an error. This is most likely a >> +configuration issue in the pre-launch environment. It could also be the sign of >> +an attempted attack. >> + >> +====== =================== >> +Name: SL_ERROR_TPM_EXTEND >> +Value: 0xc0008006 >> +====== =================== >> + >> +Description: >> + >> +There was a failed attempt to extend a TPM PCR in the Secure Launch platform >> +module. This is most likely due to misconfigured hardware or kernel. Ensure >> +the TPM chip is enabled and the kernel TPM support is built in (it should not >> +be built as a module). >> + >> +====== ====================== >> +Name: SL_ERROR_MTRR_INV_VCNT >> +Value: 0xc0008007 >> +====== ====================== >> + >> +Description: >> + >> +During early Secure Launch validation an invalid variable MTRR count was found.
>> +The pre-launch environment passes a number of MSR values to the MLE to restore >> +including the MTRRs. The values are restored by the Secure Launch early entry >> +point code. After measuring the values supplied by the pre-launch environment, >> +a discrepancy was found validating the values. It could be the sign of an >> +attempted attack. >> + >> +====== ========================== >> +Name: SL_ERROR_MTRR_INV_DEF_TYPE >> +Value: 0xc0008008 >> +====== ========================== >> + >> +Description: >> + >> +During early Secure Launch validation an invalid default MTRR type was found. >> +See SL_ERROR_MTRR_INV_VCNT for more details. >> + >> +====== ====================== >> +Name: SL_ERROR_MTRR_INV_BASE >> +Value: 0xc0008009 >> +====== ====================== >> + >> +Description: >> + >> +During early Secure Launch validation an invalid variable MTRR base value was >> +found. See SL_ERROR_MTRR_INV_VCNT for more details. >> + >> +====== ====================== >> +Name: SL_ERROR_MTRR_INV_MASK >> +Value: 0xc000800a >> +====== ====================== >> + >> +Description: >> + >> +During early Secure Launch validation an invalid variable MTRR mask value was >> +found. See SL_ERROR_MTRR_INV_VCNT for more details. >> + >> +====== ======================== >> +Name: SL_ERROR_MSR_INV_MISC_EN >> +Value: 0xc000800b >> +====== ======================== >> + >> +Description: >> + >> +During early Secure Launch validation an invalid miscellaneous enable MSR value >> +was found. See SL_ERROR_MTRR_INV_VCNT for more details. >> + >> +====== ========================= >> +Name: SL_ERROR_INV_AP_INTERRUPT >> +Value: 0xc000800c >> +====== ========================= >> + >> +Description: >> + >> +The application processors (APs) wait to be woken up by the SMP initialization >> +code. The only interrupt that they expect is an NMI; all other interrupts >> +should be masked. If an AP gets any interrupt other than an NMI it will >> +cause this error.
This error is very unlikely to occur. >> + >> +====== ========================= >> +Name: SL_ERROR_INTEGER_OVERFLOW >> +Value: 0xc000800d >> +====== ========================= >> + >> +Description: >> + >> +A buffer base and size passed to the MLE caused an integer overflow when >> +added together. This is most likely a configuration issue in the pre-launch >> +environment. It could also be the sign of an attempted attack. >> + >> +====== ================== >> +Name: SL_ERROR_HEAP_WALK >> +Value: 0xc000800e >> +====== ================== >> + >> +Description: >> + >> +An error occurred in TXT heap walking code. The underlying issue is a failure to >> +early_memremap() portions of the heap, most likely due to a resource shortage. >> + >> +====== ================= >> +Name: SL_ERROR_HEAP_MAP >> +Value: 0xc000800f >> +====== ================= >> + >> +Description: >> + >> +This error is essentially the same as SL_ERROR_HEAP_WALK but occurred during the >> +actual early_memremap() operation. >> + >> +====== ========================= >> +Name: SL_ERROR_REGION_ABOVE_4GB >> +Value: 0xc0008010 >> +====== ========================= >> + >> +Description: >> + >> +A memory region used by the MLE is above 4GB. In general this is not a problem >> +because memory > 4GB can be protected from DMA. There are certain buffers that >> +should never be above 4GB though and one of these caused the violation. This is >> +most likely a configuration issue in the pre-launch environment. It could also >> +be the sign of an attempted attack. >> + >> +====== ========================== >> +Name: SL_ERROR_HEAP_INVALID_DMAR >> +Value: 0xc0008011 >> +====== ========================== >> + >> +Description: >> + >> +The backup copy of the ACPI DMAR table which is supposed to be located in the >> +TXT heap could not be found. This is due to a bug in the platform's ACM module >> +or in firmware.
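The three region-related errors quoted so far (SL_ERROR_INTEGER_OVERFLOW, SL_ERROR_REGION_STRADDLE_4GB and SL_ERROR_REGION_ABOVE_4GB) describe checks on a base/size pair. A minimal sketch of how such checks could be ordered is below. The error values are copied from the quoted text. The function `check_region`, its `allow_high` parameter, the `FOUR_GB` macro and the order of the checks are assumptions made for illustration and are not taken from the patch's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Values as listed in the quoted documentation. */
#define SL_ERROR_REGION_STRADDLE_4GB 0xc0008005u
#define SL_ERROR_INTEGER_OVERFLOW    0xc000800du
#define SL_ERROR_REGION_ABOVE_4GB    0xc0008010u

#define FOUR_GB 0x100000000ULL       /* hypothetical helper macro */

/*
 * Hypothetical validator: returns 0 if [base, base + size) is acceptable,
 * otherwise one of the error codes above. allow_high says whether this
 * particular buffer is permitted to live above 4GB.
 */
static uint32_t check_region(uint64_t base, uint64_t size, int allow_high)
{
    uint64_t end;

    if (base + size < base)          /* unsigned wraparound */
        return SL_ERROR_INTEGER_OVERFLOW;
    end = base + size;               /* exclusive end address */

    if (base < FOUR_GB && end > FOUR_GB)
        return SL_ERROR_REGION_STRADDLE_4GB;

    if (!allow_high && base >= FOUR_GB)
        return SL_ERROR_REGION_ABOVE_4GB;

    return 0;
}

/* Usage: a small buffer well below 4GB passes every check. */
void region_example(void)
{
    assert(check_region(0x1000, 0x2000, 0) == 0);
}
```

Doing the overflow test first matters in this sketch, because the later comparisons rely on `end` being a meaningful address.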
>> + >> +====== ======================= >> +Name: SL_ERROR_HEAP_DMAR_SIZE >> +Value: 0xc0008012 >> +====== ======================= >> + >> +Description: >> + >> +The backup copy of the ACPI DMAR table in the TXT heap is too large to be stored >> +for later usage. This error is very unlikely to occur since the area reserved >> +for the copy is far larger than the DMAR should be. >> + >> +====== ====================== >> +Name: SL_ERROR_HEAP_DMAR_MAP >> +Value: 0xc0008013 >> +====== ====================== >> + >> +Description: >> + >> +The backup copy of the ACPI DMAR table in the TXT heap could not be mapped. The >> +underlying issue is a failure to early_memremap() the DMAR table, most likely >> +due to a resource shortage. >> + >> +====== ==================== >> +Name: SL_ERROR_HI_PMR_BASE >> +Value: 0xc0008014 >> +====== ==================== >> + >> +Description: >> + >> +On a system with more than 4G of RAM, the high PMR [2]_ base address should be set >> +to 4G. This error is due to that not being the case. This PMR value is set by >> +the pre-launch environment so the issue most likely originates there. It could also >> +be the sign of an attempted attack. >> + >> +====== ==================== >> +Name: SL_ERROR_HI_PMR_SIZE >> +Value: 0xc0008015 >> +====== ==================== >> + >> +Description: >> + >> +On a system with more than 4G of RAM, the high PMR [2]_ size should be set to cover >> +all RAM > 4G. This error is due to that not being the case. This PMR value is >> +set by the pre-launch environment so the issue most likely originates there. It >> +could also be the sign of an attempted attack. >> + >> +====== ==================== >> +Name: SL_ERROR_LO_PMR_BASE >> +Value: 0xc0008016 >> +====== ==================== >> + >> +Description: >> + >> +The low PMR [2]_ base should always be set to address zero. This error is due to >> +that not being the case. This PMR value is set by the pre-launch environment >> +so the issue most likely originates there.
It could also be the sign of an attempted >> +attack. >> + >> +====== ==================== >> +Name: SL_ERROR_LO_PMR_MLE >> +Value: 0xc0008017 >> +====== ==================== >> + >> +Description: >> + >> +This error indicates the MLE image is not covered by the low PMR [2]_ range. The >> +PMR values are set by the pre-launch environment so the issue most likely originates >> +there. It could also be the sign of an attempted attack. >> + >> +====== ======================= >> +Name: SL_ERROR_INITRD_TOO_BIG >> +Value: 0xc0008018 >> +====== ======================= >> + >> +Description: >> + >> +The external initrd provided is larger than 4Gb. This is not a valid >> +configuration for a Secure Launch due to managing DMA protection. >> + >> +====== ========================= >> +Name: SL_ERROR_HEAP_ZERO_OFFSET >> +Value: 0xc0008019 >> +====== ========================= >> + >> +Description: >> + >> +During a TXT heap walk an invalid/zero next table offset value was found. This >> +indicates the TXT heap is malformed. The TXT heap is initialized by the >> +pre-launch environment so the issue most likely originates there. It could also >> +be a sign of an attempted attack. In addition, ACM is also responsible for >> +manipulating parts of the TXT heap so the issue could be due to a bug in the >> +platform's ACM module. >> + >> +====== ============================= >> +Name: SL_ERROR_WAKE_BLOCK_TOO_SMALL >> +Value: 0xc000801a >> +====== ============================= >> + >> +Description: >> + >> +The AP wake block buffer passed to the MLE via the OS-MLE TXT heap table is not >> +large enough. This value is set by the pre-launch environment so the issue most >> +likely originates there. It also could be the sign of an attempted attack. 
>> + >> +====== =========================== >> +Name: SL_ERROR_MLE_BUFFER_OVERLAP >> +Value: 0xc000801b >> +====== =========================== >> + >> +Description: >> + >> +One of the buffers passed to the MLE via the OS-MLE TXT heap table overlaps >> +with the MLE image in memory. This value is set by the pre-launch environment >> +so the issue most likely originates there. It could also be the sign of an attempted >> +attack. >> + >> +====== ========================== >> +Name: SL_ERROR_BUFFER_BEYOND_PMR >> +Value: 0xc000801c >> +====== ========================== >> + >> +Description: >> + >> +One of the buffers passed to the MLE via the OS-MLE TXT heap table is not >> +protected by a PMR. This value is set by the pre-launch environment so the >> +issue most likey originates there. It could also be the sign of an attempted >> +attack. >> + >> +====== ============================= >> +Name: SL_ERROR_OS_SINIT_BAD_VERSION >> +Value: 0xc000801d >> +====== ============================= >> + >> +Description: >> + >> +The version of the OS-SINIT TXT heap table is bad. It must be 6 or greater. >> +This value is set by the pre-launch environment so the issue most likely >> +originates there. It could also be the sign of an attempted attack. It is also >> +possible though very unlikely that the platform is so old that the ACM being >> +used requires an unsupported version. >> + >> +====== ===================== >> +Name: SL_ERROR_EVENTLOG_MAP >> +Value: 0xc000801e >> +====== ===================== >> + >> +Description: >> + >> +An error occurred in the Secure Launch module while mapping the TPM event log. >> +The underlying issue is memremap() failure, most likely due to a resource >> +shortage. >> + >> +====== ======================== >> +Name: SL_ERROR_TPM_NUMBER_ALGS >> +Value: 0xc000801f >> +====== ======================== >> + >> +Description: >> + >> +The TPM 2.0 event log reports an unsupported number of hashing algorithms. 
>> +Secure launch currently only supports a maximum of two: SHA1 and SHA256. >> + >> +====== =========================== >> +Name: SL_ERROR_TPM_UNKNOWN_DIGEST >> +Value: 0xc0008020 >> +====== =========================== >> + >> +Description: >> + >> +The TPM 2.0 event log reports an unsupported hashing algorithm. Secure launch >> +currently only supports two algorithms: SHA1 and SHA256. >> + >> +====== ========================== >> +Name: SL_ERROR_TPM_INVALID_EVENT >> +Value: 0xc0008021 >> +====== ========================== >> + >> +Description: >> + >> +An invalid/malformed event was found in the TPM event log while reading it. >> +Since only trusted entities are supposed to be writing the event log, this >> +would indicate either a bug or a possible attack. >> + >> +====== ===================== >> +Name: SL_ERROR_INVALID_SLRT >> +Value: 0xc0008022 >> +====== ===================== >> + >> +Description: >> + >> +The Secure Launch Resource Table is invalid or malformed and is unusable. >> +This implies the pre-launch code did not properly setup the SLRT. >> + >> +====== =========================== >> +Name: SL_ERROR_SLRT_MISSING_ENTRY >> +Value: 0xc0008023 >> +====== =========================== >> + >> +Description: >> + >> +The Secure Launch Resource Table is missing a required entry within it. >> +This implies the pre-launch code did not properly setup the SLRT. >> + >> +====== ================= >> +Name: SL_ERROR_SLRT_MAP >> +Value: 0xc0008024 >> +====== ================= >> + >> +Description: >> + >> +An error occurred in the Secure Launch module while mapping the Secure Launch >> +Resource table. The underlying issue is memremap() failure, most likely due to >> +a resource shortage. >> + >> +.. [1] >> + MLE: Measured Launch Environment is the binary runtime that is measured and >> + then run by the TXT SINIT ACM. The TXT MLE Development Guide describes the >> + requirements for the MLE in detail. >> + >> +.. 
[2] >> + PMR: Intel VTd has a feature in the IOMMU called Protected Memory Registers. >> + There are two of these registers and they allow all DMA to be blocked >> + to large areas of memory. The low PMR can cover all memory below 4Gb on 2Mb >> + boundaries. The high PMR can cover all RAM on the system, again on 2Mb >> + boundaries. This feature is used during a Secure Launch by TXT. >> + >> +.. [3] >> + Secure Launch Specification: https://trenchboot.org/specifications/Secure_Launch/ >> diff --git a/Documentation/security/launch-integrity/secure_launch_overview.rst b/Documentation/security/launch-integrity/secure_launch_overview.rst >> new file mode 100644 >> index 0000000..ba91d73 >> --- /dev/null >> +++ b/Documentation/security/launch-integrity/secure_launch_overview.rst >> @@ -0,0 +1,220 @@ >> +====================== >> +Secure Launch Overview >> +====================== >> + >> +Overview >> +======== >> + >> +Prior to the start of the TrenchBoot project, the only active Open Source >> +project supporting dynamic launch was Intel's tboot project to support their >> +implementation of dynamic launch known as Intel Trusted eXecution Technology >> +(TXT). The approach taken by tboot was to provide an exokernel that could >> +handle the launch protocol implemented by Intel's special loader, the SINIT >> +Authenticated Code Module (ACM [2]_) and remained in memory to manage the SMX >> +CPU mode that a dynamic launch would put a system. While it is not precluded >> +from being used for doing a late launch, tboot's primary use case was to be >> +used as an early launch solution. As a result the TrenchBoot project started >> +the development of Secure Launch kernel feature to provide a more generalized >> +approach. The focus of the effort is twofold, the first is to make the Linux >> +kernel directly aware of the launch protocol used by Intel, AMD/Hygon, Arm, and >> +potentially OpenPOWER. The second is to make the Linux kernel be able to >> +initiate a dynamic launch. 
It is through this approach that the Secure Launch >> +kernel feature creates a basis for the Linux kernel to be used in a variety of >> +dynamic launch use cases. >> + >> +.. note:: >> + A quick note on terminology. The larger open source project itself is >> + called TrenchBoot, which is hosted on GitHub (links below). The kernel >> + feature enabling the use of the x86 technology is referred to as "Secure >> + Launch" within the kernel code. >> + >> +Goals >> +===== >> + >> +The first use case that the TrenchBoot project focused on was the ability for >> +the Linux kernel to be started by a dynamic launch, in particular as part of an >> +early launch sequence. In this case the dynamic launch will be initiated by any >> +boot loader with associated support added to it, for example the first targeted >> +boot loader in this case was GRUB2. An integral part of establishing a >> +measurement-based launch integrity involves measuring everything that is >> +intended to be executed (kernel image, initrd, etc) and everything that will >> +configure that kernel to execute (command line, boot params, etc). Then storing >> +those measurements in a protected manner. Both the Intel and AMD dynamic launch >> +implementations leverage the Trusted Platform Module (TPM) to store those >> +measurements. The TPM itself has been designed such that a dynamic launch >> +unlocks a specific set of Platform Configuration Registers (PCR) for holding >> +measurement taken during the dynamic launch. These are referred to as the DRTM >> +PCRs, PCRs 17-22. Further details on this process can be found in the >> +documentation for the GETSEC instruction provided by Intel's TXT and the SKINIT >> +instruction provided by AMD's AMD-V. The documentation on these technologies >> +can be readily found online; see the `Resources`_ section below for references. >> + >> +.. note:: >> + Currently only Intel TXT is supported in this first release of the Secure >> + Launch feature. 
AMD/Hygon SKINIT and Arm support will be added in a >> + subsequent release. >> + >> +To enable the kernel to be launched by GETSEC, a stub, the Secure Launch stub, >> +must be built into the setup section of the compressed kernel to handle the >> +specific state that the dynamic launch process leaves the BSP in. Also the Secure >> +Launch stub must measure everything that is going to be used as early as >> +possible. This stub code and subsequent code must also deal with the specific >> +state that the dynamic launch leaves the APs in. >> + >> +Design Decisions >> +================ >> + >> +A number of design decisions were made during the development of the Secure >> +Launch feature. The two primary guiding decisions were: >> + >> + - Keeping the Secure Launch code as separate from the rest of the kernel >> + as possible. >> + - Modifying the existing boot path of the kernel as little as possible. >> + >> +The following illustrate how the implementation followed these design >> +decisions: >> + >> + - All the entry point code necessary to properly configure the system post >> + launch is found in sl_stub.S in the compressed kernel image. This code >> + validates the state of the system, restores necessary system operating >> + configurations and properly handles post launch CPU states. >> + - After the sl_stub.S is complete, it jumps directly to the unmodified >> + startup_32 kernel entry point. >> + - A single call is made to a function sl_main() prior to the main kernel >> + decompression step. This code performs further validation and takes the >> + needed DRTM measurements. >> + - After the call to sl_main(), the main kernel is decompressed and boots as >> + it normally would. >> + - Final setup for the Secure Launch kernel is done in a separate Secure >> + Launch module that is loaded via a late initcall.
This code is responsible >> + for extending the measurements taken earlier into the TPM DRTM PCRs and >> + setting up the securityfs interface to allow access the TPM event log and >> + public TXT registers. >> + - On the reboot and kexec paths, calls are made to a function to finalize the >> + state of the Secure Launch kernel. >> + >> +The one place where Secure Launch code is mixed directly in with kernel code is >> +in the SMP boot code. This is due to the unique state that the dynamic launch >> +leaves the APs in. On Intel this involves using a method other than the >> +standard INIT-SIPI sequence. >> + >> +A final note is that originally the extending of the PCRs was completed in the >> +Secure Launch stub when the measurements were taken. An alternative solution >> +had to be implemented due to the TPM maintainers objecting to the PCR >> +extensions being done with a minimal interface to the TPM that was an >> +independent implementation of the mainline kernel driver. Since the mainline >> +driver relies heavily on kernel interfaces not available in the compressed >> +kernel, it was not possible to reuse the mainline TPM driver. This resulted in >> +the decision to move the extension operations to the Secure Launch module in >> +the mainline kernel where the TPM driver would be available. >> + >> +Basic Boot Flow >> +=============== >> + >> +Outlined here is summary of the boot flow for Secure Launch. A more detailed >> +review of Secure Launch process can be found in the Secure Launch >> +Specification, a link is located in the `Resources`_ section. >> + >> +Pre-launch: *Phase where the environment is prepared and configured to initiate the >> +secure launch by the boot chain.* >> + >> + - The SLRT is initialized and dl_stub is placed in memory. >> + - Load the kernel, initrd and ACM [2]_ into memory. >> + - Setup the TXT heap and page tables describing the MLE [1]_ per the >> + specification. >> + - If non-UEFI platform, dl_stub is called. 
>> + - If UEFI platform, the SLRT is registered with UEFI and efi-stub is called. >> + - Upon completion, efi-stub will call EBS followed by dl_stub. >> + - The dl_stub will prepare the CPU and the TPM for the launch. >> + - The secure launch is then initiated with the GETSEC[SENTER] instruction. >> + >> +Post-launch: *Phase where control is passed from the ACM to the MLE and the secure >> +kernel begins execution.* >> + >> + - Entry from the dynamic launch jumps to the SL stub. >> + - SL stub fixes up the world on the BSP. >> + - For TXT, SL stub wakes the APs, fixes up their worlds. >> + - For TXT, APs are left halted waiting for an NMI to wake them. >> + - SL stub jumps to startup_32. >> + - SL main does validation of buffers and memory locations. It sets >> + the boot parameter loadflag value SLAUNCH_FLAG to inform the main >> + kernel that a Secure Launch was done. >> + - SL main locates the TPM event log and writes the measurements of >> + configuration and module information into it. >> + - Kernel boot proceeds normally from this point. >> + - During early setup, slaunch_setup() runs to finish some validation >> + and setup tasks. >> + - The SMP bring up code is modified to wake the waiting APs. APs vector >> + to rmpiggy and start up normally from that point. >> + - SL platform module is registered as a late initcall module. It reads >> + the TPM event log and extends the measurements taken into the TPM PCRs. >> + - SL platform module initializes the securityfs interface to allow >> + access to the TPM event log and TXT public registers. >> + - Kernel finishes booting normally. >> + - SEXIT support to leave SMX mode is present on the kexec path and >> + the various reboot paths (poweroff, reset, halt). >> + >> +PCR Usage >> +========= >> + >> +In the TCG DRTM architecture there are three PCRs defined for usage, PCR.Details >> +(PCR17), PCR.Authorities (PCR18), and PCR.DLME_Authority (PCR19).
For a deeper >> +understanding of Detail and Authorities it is recommended to review the TCG >> +DRTM architecture. >> + >> +To determine PCR usage, Linux Secure Launch follows the TrenchBoot Secure >> +Launch Specification of using a measurement policy stored in the SLRT. The >> +policy details what should be measured and the PCR in which to store the >> +measurement. The measurement policy provides the ability to select the >> +PCR.DLME_Detail (PCR20) PCR as the location for the DRTM components measured by >> +the kernel, e.g. external initrd image. This can then be combined with storing >> +the user authority in the PCR.DLME_Authority PCR to seal/attest to different >> +variations of platform details/authorities and user details/authorities. An >> +example of how this can be achieved was presented in the FOSDEM - 2021 talk >> +"Secure Upgrades with DRTM". >> + >> +Resources >> +========= >> + >> +The TrenchBoot project: >> + >> +https://trenchboot.org >> + >> +Secure Launch Specification: >> + >> +https://trenchboot.org/specifications/Secure_Launch/ >> + >> +Trusted Computing Group's D-RTM Architecture: >> + >> +https://trustedcomputinggroup.org/wp-content/uploads/TCG_D-RTM_Architecture_v1-0_Published_06172013.pdf >> + >> +TXT documentation in the Intel TXT MLE Development Guide: >> + >> +https://www.intel.com/content/dam/www/public/us/en/documents/guides/intel-txt-software-development-guide.pdf >> + >> +TXT instructions documentation in the Intel SDM Instruction Set volume: >> + >> +https://software.intel.com/en-us/articles/intel-sdm >> + >> +AMD SKINIT documentation in the System Programming manual: >> + >> +https://www.amd.com/system/files/TechDocs/24593.pdf >> + >> +GRUB Secure Launch support: >> + >> +https://github.com/TrenchBoot/grub/tree/grub-sl-fc-38-dlstub >> + >> +FOSDEM 2021: Secure Upgrades with DRTM >> + >> +https://archive.fosdem.org/2021/schedule/event/firmware_suwd/ >> + >> +.. 
[1] >> + MLE: Measured Launch Environment is the binary runtime that is measured and >> + then run by the TXT SINIT ACM. The TXT MLE Development Guide describes the >> + requirements for the MLE in detail. >> + >> +.. [2] >> + ACM: Intel's Authenticated Code Module. This is the 32-bit binary blob that >> + is run securely by the GETSEC[SENTER] instruction during a measured launch. It is described >> + in the Intel documentation on TXT and versions for various chipsets are >> + signed and distributed by Intel. > > The formatting LGTM, thanks! > > Regardless, > > Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> Thank you, Ross >
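Several of the error codes quoted above (SL_ERROR_INTEGER_OVERFLOW, SL_ERROR_REGION_ABOVE_4GB, SL_ERROR_BUFFER_BEYOND_PMR) describe checks on a buffer's base and size. The following is a minimal user-space sketch of how such checks could be ordered, not the code from the patch series. The error values are taken from the quoted documentation; `struct pmr_range`, `range_covers()` and `check_buffer()` are hypothetical names introduced only for illustration, and the sketch assumes each PMR range itself does not wrap around the address space.

```c
#include <stdbool.h>
#include <stdint.h>

/* Error values as listed in the quoted documentation. */
#define SL_ERROR_INTEGER_OVERFLOW  0xc000800dU
#define SL_ERROR_REGION_ABOVE_4GB  0xc0008010U
#define SL_ERROR_BUFFER_BEYOND_PMR 0xc000801cU

#define SZ_4G 0x100000000ULL

/* Hypothetical description of one PMR range; not a kernel structure. */
struct pmr_range {
	uint64_t base;
	uint64_t size;
};

/* True if [base, end) lies entirely inside the range r. */
static bool range_covers(const struct pmr_range *r, uint64_t base, uint64_t end)
{
	return base >= r->base && end - r->base <= r->size;
}

/*
 * Hypothetical validation helper. Returns 0 when the buffer passes,
 * otherwise one of the SL_ERROR_* values above.
 */
static uint32_t check_buffer(uint64_t base, uint64_t size, bool must_be_low,
			     const struct pmr_range *lo,
			     const struct pmr_range *hi)
{
	uint64_t end;

	/* Reject base + size wrapping around before computing the end. */
	if (size > UINT64_MAX - base)
		return SL_ERROR_INTEGER_OVERFLOW;
	end = base + size;

	/* Only some buffers are required to sit below 4GB. */
	if (must_be_low && end > SZ_4G)
		return SL_ERROR_REGION_ABOVE_4GB;

	/* The buffer must be fully covered by the low or the high PMR. */
	if (range_covers(lo, base, end) || range_covers(hi, base, end))
		return 0;

	return SL_ERROR_BUFFER_BEYOND_PMR;
}
```

Doing the overflow test first means the later comparisons can rely on `end` being a valid address, which is why the sketch orders the checks this way.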
On Thu, May 04, 2023 at 02:50:11PM +0000, Ross Philipson wrote: > +Secure Launch does not interoperate with KASLR. If possible, the MLE should be > +built with KASLR disabled:: Why does Secure Launch not interoperate with KASLR? Re: IOMMUs > +It is recommended that no other command line options should be set to override > +the defaults above. What happens if they are? Does doing so change the security posture of the system? If so, will the measurements be different in a way that demonstrates the system is in an insecure state?
On Thu, May 04 2023 at 14:50, Ross Philipson wrote: > +KASLR Configuration > +------------------- > + > +Secure Launch does not interoperate with KASLR. If possible, the MLE should be > +built with KASLR disabled:: Why? > + "Processor type and features" --> > + "Build a relocatable kernel" --> > + "Randomize the address of the kernel image (KASLR) [ ]" > + > +This unsets the Kconfig value CONFIG_RANDOMIZE_BASE. > + > +If not possible, KASLR must be disabled on the kernel command line when doing > +a Secure Launch as follows:: > + > + nokaslr So what happens if KASLR is enabled in Kconfig and not disabled on the command line? > +IOMMU Configuration > +------------------- > + > +When doing a Secure Launch, the IOMMU should always be enabled and the drivers > +loaded. However, IOMMU passthrough mode should never be used. This leaves the > +MLE completely exposed to DMA after the PMR's [2]_ are disabled. The current default > +mode is to use IOMMU in lazy translated mode but strict translated mode is the preferred > +IOMMU mode and this should be selected in the build configuration:: > + > + "Device Drivers" --> > + "IOMMU Hardware Support" --> > + "IOMMU default domain type" --> > + "(X) Translated - Strict" > + > +In addition, the Intel IOMMU should be on by default. The following sets this as the > +default in the build configuration:: > + > + "Device Drivers" --> > + "IOMMU Hardware Support" --> > + "Support for Intel IOMMU using DMA Remapping Devices [*]" > + > +and:: > + > + "Device Drivers" --> > + "IOMMU Hardware Support" --> > + "Support for Intel IOMMU using DMA Remapping Devices [*]" --> > + "Enable Intel DMA Remapping Devices by default [*]" > + > +It is recommended that no other command line options should be set to override > +the defaults above. Is any of this validated and are proper warnings emitted or is it just recommended and left to the user to do the right thing? Thanks, tglx
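As an illustration of the kind of validation being asked about here, the sketch below scans a kernel command line for the settings the quoted documentation warns against. It is a user-space example under stated assumptions, not something the patch series does. `nokaslr`, `iommu=pt`, `iommu.passthrough=1` and `intel_iommu=off` are existing kernel parameters; `cmdline_has()`, `sl_cmdline_issues()` and the `SL_CMDLINE_*` flags are hypothetical names, and options are assumed to be separated by single spaces.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical issue flags returned by sl_cmdline_issues(). */
enum {
	SL_CMDLINE_OK    = 0,
	SL_CMDLINE_KASLR = 1 << 0, /* KASLR built in and not disabled */
	SL_CMDLINE_IOMMU = 1 << 1, /* IOMMU disabled or in passthrough mode */
};

/* True if opt appears in cmdline as a whole space-delimited word. */
static bool cmdline_has(const char *cmdline, const char *opt)
{
	size_t len = strlen(opt);
	const char *p = cmdline;

	while ((p = strstr(p, opt)) != NULL) {
		bool starts = (p == cmdline) || (p[-1] == ' ');
		bool ends = (p[len] == '\0') || (p[len] == ' ');

		if (starts && ends)
			return true;
		p++; /* keep searching after a partial-word match */
	}
	return false;
}

/* Returns a bitmask of SL_CMDLINE_* flags for the given command line. */
static unsigned int sl_cmdline_issues(const char *cmdline, bool kaslr_built_in)
{
	unsigned int issues = SL_CMDLINE_OK;

	if (kaslr_built_in && !cmdline_has(cmdline, "nokaslr"))
		issues |= SL_CMDLINE_KASLR;

	if (cmdline_has(cmdline, "iommu=pt") ||
	    cmdline_has(cmdline, "iommu.passthrough=1") ||
	    cmdline_has(cmdline, "intel_iommu=off"))
		issues |= SL_CMDLINE_IOMMU;

	return issues;
}
```

A caller could pass `kaslr_built_in` according to whether CONFIG_RANDOMIZE_BASE is set in the kernel being launched and emit a warning for each flag returned.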
On 5/12/23 06:47, Matthew Garrett wrote: > On Thu, May 04, 2023 at 02:50:11PM +0000, Ross Philipson wrote: >> +Secure Launch does not interoperate with KASLR. If possible, the MLE should be >> +built with KASLR disabled:: > > Why does Secure Launch not interoperate with KASLR? > > Re: IOMMUs Until the IOMMU driver comes online, memory is protected by the PMR regions requested by the Preamble (pre-launch code) in accordance with the Intel TXT specifications and configured by the ACM. The KASLR randomizer runs before the IOMMU driver is able to come online and ensure that frames used by the kernel are protected and that frames a driver may have registered in a BAR are not blocked. >> +It is recommended that no other command line options should be set to override >> +the defaults above. > > What happens if they are? Does doing so change the security posture of > the system? If so, will the measurements be different in a way that > demonstrates the system is in an insecure state? > In an early version of the patch series this was enforced when turning on Secure Launch, but concerns were raised over this approach and we were asked to allow the user to be able to shoot themselves in the foot. Overriding these values could result in an insecure state, an unstable system, or both.
On Fri, Jun 16, 2023 at 12:44:27PM -0400, Daniel P. Smith wrote: > > On 5/12/23 06:47, Matthew Garrett wrote: > > On Thu, May 04, 2023 at 02:50:11PM +0000, Ross Philipson wrote: > > > +Secure Launch does not interoperate with KASLR. If possible, the MLE should be > > > +built with KASLR disabled:: > > > > Why does Secure Launch not interoperate with KASLR? > > > > Re: IOMMUs > > Until the IOMMU driver comes online, memory is protected by the PMRs regions > requested by the Preamble (pre-launch code) in accordance with Intel TXT > specifications and configured by the ACM. The KASLR randomizer will run > before the IOMMU driver is able to come online and ensure frames used by the > kernel are protected as well as frames that a driver may registered in a BAR > are not blocked. This seems unfortunate. Presumably we're not able to modify the PMRs at this point? This also seems like a potential issue for IOMMU config in general - the presumption is that the firmware should be configuring the IOMMU in such a way that DMA-capable devices can't attack the firmware while we're in the boot environment, and if KASLR is leaving a window there then it seems like we'd need to fix that? > > > +It is recommended that no other command line options should be set to override > > > +the defaults above. > > > > What happens if they are? Does doing so change the security posture of > > the system? If so, will the measurements be different in a way that > > demonstrates the system is in an insecure state? > > > > In an early version of the patch series this was enforced when turning on > Secure Launch, but concerns were raised over this approach and was asked to > allow the user to be able to shoot themselves in the foot. Overriding these > values could render either an insecure state and/or an unstable system. If we're in an insecure state, is that something that would show up in the form of different measurements?
On 6/16/23 12:54, Matthew Garrett wrote: > On Fri, Jun 16, 2023 at 12:44:27PM -0400, Daniel P. Smith wrote: >> >> On 5/12/23 06:47, Matthew Garrett wrote: >>> On Thu, May 04, 2023 at 02:50:11PM +0000, Ross Philipson wrote: >>>> +Secure Launch does not interoperate with KASLR. If possible, the MLE should be >>>> +built with KASLR disabled:: >>> >>> Why does Secure Launch not interoperate with KASLR? >>> >>> Re: IOMMUs >> >> Until the IOMMU driver comes online, memory is protected by the PMRs regions >> requested by the Preamble (pre-launch code) in accordance with Intel TXT >> specifications and configured by the ACM. The KASLR randomizer will run >> before the IOMMU driver is able to come online and ensure frames used by the >> kernel are protected as well as frames that a driver may registered in a BAR >> are not blocked. > > This seems unfortunate. Presumably we're not able to modify the PMRs at > this point? This also seems like a potential issue for IOMMU config in > general - the presumption is that the firmware should be configuring the > IOMMU in such a way that DMA-capable devices can't attack the firmware > while we're in the boot environment, and if KASLR is leaving a window > there then it seems like we'd need to fix that? While unfortunate, it is a bit of the nature of the problem KASLR is attempting to address. If you know in advance where kernel pages are going to live and the frames that will be used for DMA, then have you not defeated the purpose of the randomization? As for the firmware use of the IOMMU, I am fairly certain those tables will get invalidated by the ACM when it is setting up the PMRs. >>>> +It is recommended that no other command line options should be set to override >>>> +the defaults above. >>> >>> What happens if they are? Does doing so change the security posture of >>> the system? If so, will the measurements be different in a way that >>> demonstrates the system is in an insecure state? 
>>> >> >> In an early version of the patch series this was enforced when turning on >> Secure Launch, but concerns were raised over this approach and was asked to >> allow the user to be able to shoot themselves in the foot. Overriding these >> values could render either an insecure state and/or an unstable system. > > If we're in an insecure state, is that something that would show up in > the form of different measurements? Yes, you would get a different measurement for the commandline. If you are thinking in terms of attestation, I would expect that the attestation measurement db would have a record for an acceptable commandline and would determine the system to be in an unknown state if it did not match. While the idea could be explored to create measurements based on configurations of kernel subsystems, this would likely entail instrumentation in those subsystems to assert a measurement to their configuration. Maybe IMA could cover something like this? It would definitely enable the ability to make deeper assessments about the state of a system, but I think this is out of the scope of what Secure Launch is attempting to do.
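The attestation-side check described in this reply, where a command line measurement is compared against a database of acceptable values, can be sketched as a simple digest lookup. This is an illustrative example only: the names `classify_cmdline_digest()` and `enum attest_state` are hypothetical, SHA-256 is assumed as the digest algorithm, and the digest is assumed to have been recovered from the TPM event log and quote by other means.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SHA256_DIGEST_SIZE 32

/* Hypothetical verdicts a verifier might record for a system. */
enum attest_state {
	ATTEST_KNOWN_GOOD,
	ATTEST_UNKNOWN,
};

/*
 * Compare a measured command line digest against a list of acceptable
 * digests. Anything not in the list is treated as an unknown state
 * rather than as a specific failure.
 */
static enum attest_state
classify_cmdline_digest(const uint8_t digest[SHA256_DIGEST_SIZE],
			const uint8_t (*known_good)[SHA256_DIGEST_SIZE],
			size_t count)
{
	for (size_t i = 0; i < count; i++) {
		if (memcmp(digest, known_good[i], SHA256_DIGEST_SIZE) == 0)
			return ATTEST_KNOWN_GOOD;
	}
	return ATTEST_UNKNOWN;
}
```

Because any change to the command line changes its digest, a command line that overrides the recommended defaults would fall into the unknown state unless its digest had been explicitly added to the list.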
diff --git a/Documentation/security/index.rst b/Documentation/security/index.rst index 6ed8d2f..fade37e 100644 --- a/Documentation/security/index.rst +++ b/Documentation/security/index.rst @@ -18,3 +18,4 @@ Security Documentation digsig landlock secrets/index + launch-integrity/index diff --git a/Documentation/security/launch-integrity/index.rst b/Documentation/security/launch-integrity/index.rst new file mode 100644 index 0000000..28eed91d --- /dev/null +++ b/Documentation/security/launch-integrity/index.rst @@ -0,0 +1,10 @@ +===================================== +System Launch Integrity documentation +===================================== + +.. toctree:: + + principles + secure_launch_overview + secure_launch_details + diff --git a/Documentation/security/launch-integrity/principles.rst b/Documentation/security/launch-integrity/principles.rst new file mode 100644 index 0000000..73cf063 --- /dev/null +++ b/Documentation/security/launch-integrity/principles.rst @@ -0,0 +1,313 @@ +======================= +System Launch Integrity +======================= + +This document serves to establish a common understanding of what a system +launch is, the integrity concern for system launch, and why using a Root of Trust +(RoT) from a Dynamic Launch may be desired. Throughout this document, +terminology from the Trusted Computing Group (TCG) and the National Institute of +Standards and Technology (NIST) is used to ensure that vendor-neutral language is +used to describe and reference security-related concepts. + +System Launch +============= + +There is a tendency to consider the classical power-on boot as the only +means to launch an Operating System (OS) on a computer system, but in fact most +modern processors support two methods to launch the system. To provide clarity, a +common definition of a system launch should be established.
This definition is +that during a single power life cycle of a system, a System Launch consists +of an initialization event, typically in hardware, that is followed by an +executing software payload that takes the system from the initialized state to +a running state. Driven by the Trusted Computing Group (TCG) architecture, +modern processors are able to support two methods to launch a system; these two +types of system launch are known as Static Launch and Dynamic Launch. + +Static Launch +------------- + +Static launch is the system launch associated with the power cycle of the CPU. +Thus static launch refers to the classical power-on boot where the +initialization event is the release of the CPU from reset and the system +firmware is the software payload that brings the system up to a running state. +Since static launch is the system launch associated with the beginning of the +power life cycle of a system, it is therefore a fixed, one-time system launch. +It is because of this that static launch is referred to and thought of as being +"static". + +Dynamic Launch +-------------- + +Modern CPU architectures provide a mechanism to re-initialize the system to a +"known good" state without requiring a power event. This re-initialization +event is the event for a dynamic launch and is referred to as the Dynamic +Launch Event (DLE). The DLE functions by accepting a software payload, referred +to as the Dynamic Configuration Environment (DCE), that execution is handed to +after the DLE is invoked. The DCE is responsible for bringing the system back +to a running state. Since the dynamic launch is not tied to a power event like +the static launch, this enables a dynamic launch to be initiated at any time +and multiple times during a single power life cycle. This dynamism is the +reasoning behind referring to this system launch as being dynamic.
+ +Because a dynamic launch can be conducted at any time during a single power +life cycle, it is classified as one of two types: an early launch or a +late launch. + +:Early Launch: When a dynamic launch is used as a transition from a static + launch chain to the final Operating System. + +:Late Launch: The usage of a dynamic launch by an executing Operating System to + transition to a "known good" state to perform one or more operations, e.g. to + launch into a new Operating System. + +System Integrity +================ + +A computer system can be considered a collection of mechanisms that work +together to produce a result. The assurance that the mechanisms are functioning +correctly and producing the expected result is the integrity of the system. To +ensure a system's integrity there is a subset of these mechanisms, commonly +referred to as security mechanisms, that are present to help ensure the system +produces the expected result or at least detect that an unexpected +result may have occurred. Since the security mechanisms are relied upon to +ensure the integrity of the system, these mechanisms are trusted. Upon +inspection these security mechanisms each have a set of properties and these +properties can be evaluated to determine how susceptible a mechanism might be +to failure. This assessment is referred to as the Strength of Mechanism and for +a trusted mechanism enables the trustworthiness of that mechanism to be +quantified. + +For software systems there are two system states for which the integrity is +critical: when the software is loaded into memory and when the software is +executing on the hardware. Ensuring that the expected software is loaded into +memory is referred to as load-time integrity while ensuring that the software +executing is the expected software is the runtime integrity of that software.
+
+Load-time Integrity
+-------------------
+
+It is critical to understand what load-time integrity establishes about a
+system and what is assumed, i.e. what is being trusted. Load-time integrity is
+when a trusted entity, i.e. an entity with an assumed integrity, takes an
+action to assess an entity being loaded into memory before it is used. A
+variety of mechanisms may be used to conduct the assessment, each with
+different properties. A particular property is whether the mechanism creates
+evidence of the assessment. Cryptographic signature checking and hashing are
+the common assessment operations used.
+
+A signature checking assessment functions by requiring a representation of the
+accepted authorities and uses those representations to assess if the entity has
+been signed by an accepted authority. The benefit of this process is that the
+assessment process includes an adjudication of the assessment. The drawbacks
+are that 1) the adjudication is susceptible to tampering by the Trusted
+Computing Base (TCB), 2) there is no evidence to assert that an untampered
+adjudication was completed, and 3) the system must be an active participant in
+the key management infrastructure.
+
+A cryptographic hashing assessment does not adjudicate the assessment but
+instead generates evidence of the assessment to be adjudicated independently.
+The benefit of this approach is that the assessment may be simple enough that
+it can be implemented as an immutable mechanism, e.g. in hardware.
+Additionally it is possible for the adjudication to be conducted where it
+cannot be tampered with by the TCB. The drawback is that a compromised
+environment will be allowed to execute until an adjudication can be completed.
+
+Ultimately load-time integrity provides confidence that the correct entity was
+loaded and in the absence of a runtime integrity mechanism assumes, i.e.
+trusts, that the entity will never become corrupted.
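The split between assessment and adjudication in a hashing-based scheme can be sketched in a few lines of C. This is only an illustration under stated assumptions: `toy_hash()`, `measure()`, `adjudicate()` and `struct evidence_log` are hypothetical names that do not come from the patch, and the hash is a non-cryptographic FNV-1a stand-in for a real digest such as SHA-256.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_EVIDENCE 8

/* Evidence log: one digest per assessed entity, in load order. */
struct evidence_log {
	uint64_t digest[MAX_EVIDENCE];
	size_t count;
};

/*
 * Toy 64-bit FNV-1a hash. NOT cryptographic; it only stands in for a real
 * hash so that the sketch stays self-contained.
 */
uint64_t toy_hash(const uint8_t *data, size_t len)
{
	uint64_t h = 0xcbf29ce484222325ULL;	/* FNV-1a offset basis */
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= data[i];
		h *= 0x100000001b3ULL;		/* FNV-1a prime */
	}
	return h;
}

/* Assessment step: record evidence only, make no accept/reject decision. */
bool measure(struct evidence_log *log, const uint8_t *image, size_t len)
{
	if (log->count >= MAX_EVIDENCE)
		return false;
	log->digest[log->count++] = toy_hash(image, len);
	return true;
}

/* Adjudication step: done later, possibly by a separate verifier. */
bool adjudicate(const struct evidence_log *log, const uint64_t *expected,
		size_t n_expected)
{
	size_t i;

	if (log->count != n_expected)
		return false;
	for (i = 0; i < n_expected; i++) {
		if (log->digest[i] != expected[i])
			return false;
	}
	return true;
}
```

Note that `measure()` never refuses to continue because of what it measured. That mirrors the drawback described above: a modified image still runs until some party calls `adjudicate()` on the collected evidence.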
+
+Runtime Integrity
+-----------------
+
+Runtime integrity in the general sense is when a trusted entity makes an
+assessment of an entity at any point in time during the assessed entity's
+execution. A more concrete explanation is the taking of an integrity assessment
+of an active process executing on the system at any point during the process'
+execution. Often the load-time integrity of an operating system's user-space,
+i.e. the operating environment, is confused with the runtime integrity of the
+system since it is an integrity assessment of the "runtime" software. The
+reality is that actual runtime integrity is a very difficult problem and thus
+not very many solutions are public and/or available. One example of a runtime
+integrity solution would be Johns Hopkins Applied Physics Laboratory's (APL)
+Linux Kernel Integrity Measurer (LKIM).
+
+Trust Chains
+============
+
+Building upon the understanding of security mechanisms to establish load-time
+integrity of an entity, it is possible to chain together load-time integrity
+assessments to establish the integrity of the whole system. This process is
+known as transitive trust and provides the concept of building a chain of
+load-time integrity assessments, commonly referred to as a trust chain. These
+assessments may be used to adjudicate the load-time integrity of the whole
+system. This trust chain is started by a trusted entity that does the first
+assessment. This first entity is referred to as the Root of Trust (RoT) with the
+entity's name being derived from the mechanism used for the assessment, i.e.
+RoT for Verification (RTV) and RoT for Measurement (RTM).
+
+A trust chain is itself a mechanism, specifically a mechanism of mechanisms,
+and therefore it too has a Strength of Mechanism. The factors that contribute
+to a trust chain's strength are:
+
+  - The strength of the chain's RoT
+  - The strength of each member of the trust chain
+  - The length, i.e.
the number of members, of the chain
+
+Therefore to provide the strongest trust chains, they should start with a
+strong RoT, consist of members of low complexity, and have as few members
+as possible. In a more colloquial sense,
+a trust chain is only as strong as its weakest link and more links increase
+the probability of a weak link.
+
+Dynamic Launch Components
+=========================
+
+The TCG architecture for dynamic launch is composed of a series of components
+that are used to set up and then carry out the launch. These components work
+together to construct an RTM trust chain that is rooted in the dynamic launch
+and thus commonly referred to as the Dynamic Root of Trust for Measurement
+(DRTM) chain.
+
+What follows is a brief explanation of each component in execution order. A
+subset of these components is what establishes the dynamic launch's trust
+chain.
+
+Dynamic Configuration Environment Preamble
+------------------------------------------
+
+The Dynamic Configuration Environment (DCE) Preamble is responsible for setting
+up the system environment in preparation for a dynamic launch. The DCE Preamble
+is not a part of the DRTM trust chain.
+
+Dynamic Launch Event
+--------------------
+
+The dynamic launch event is the event, typically a CPU instruction, that triggers
+the system's dynamic launch mechanism to begin the launch. The dynamic launch
+mechanism is also the RoT for the DRTM trust chain.
+
+Dynamic Configuration Environment
+---------------------------------
+
+The dynamic launch mechanism may have resulted in a reset of a portion of the
+system. To bring the system back to an adequate state for system software the
+dynamic launch will hand over control to the DCE. Prior to handing over this
+control, the dynamic launch will measure the DCE. Once the DCE is complete it
+will proceed to measure and then execute the Dynamic Launch Measured
+Environment (DLME).
+
+Dynamic Launch Measured Environment
+-----------------------------------
+
+The DLME is the first system kernel to have control of the system but may not
+be the last. Depending on the usage and configuration, the DLME may be the
+final/target operating system or it may be a boot loader that will load the
+final/target operating system.
+
+Why DRTM
+========
+
+DRTM increases the load-time integrity of the system by providing a trust
+chain that has an immutable hardware RoT and uses a limited number of small,
+special-purpose code components to establish the trust chain that starts
+the target operating system. As mentioned in the Trust Chains section, these are
+the three main factors in driving up the strength of a trust chain. As can be
+seen from the BootHole exploit, which did not affect the integrity of
+DRTM solutions, the sophistication of attacks targeting system launch is at an
+all-time high. There is no reason a system should not employ every integrity
+measure hardware makes available. This is the crux of a defense-in-depth
+approach to system security. In the past the now closed SMI gap was often
+pointed to as invalidating DRTM, which was nothing but a strawman
+argument. As has continued to be demonstrated, if/when SMM is corrupted it can
+always circumvent all load-time integrity, SRTM and DRTM, because it is a
+runtime integrity problem. Regardless, Intel and AMD have both deployed
+runtime integrity for SMI and SMM which is tied directly to DRTM such that this
+perceived deficiency is now non-existent and the world is moving forward with
+an expectation that DRTM must be present.
+
+Glossary
+========
+
+.. glossary::
+  integrity
+    Guarding against improper information modification or destruction, and
+    includes ensuring information non-repudiation and authenticity.
+
+    - NIST CNSSI No.
4009 - https://www.cnss.gov/CNSS/issuances/Instructions.cfm
+
+  mechanism
+    A process or system that is used to produce a particular result.
+
+    - NIST Special Publication 800-160 (Volume 1) - https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-160v1.pdf
+
+  risk
+    A measure of the extent to which an entity is threatened by a potential
+    circumstance or event, and typically a function of: (i) the adverse impacts
+    that would arise if the circumstance or event occurs; and (ii) the
+    likelihood of occurrence.
+
+    - NIST SP 800-30 Rev. 1 - https://nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-30r1.pdf
+
+  security mechanism
+    A device or function designed to provide one or more security services
+    usually rated in terms of strength of service and assurance of the design.
+
+    - NIST CNSSI No. 4009 - https://www.cnss.gov/CNSS/issuances/Instructions.cfm
+
+  Strength of Mechanism
+    A scale for measuring the relative strength of a security mechanism
+
+    - NIST CNSSI No. 4009 - https://www.cnss.gov/CNSS/issuances/Instructions.cfm
+
+  transitive trust
+    Also known as "Inductive Trust", in this process a Root of Trust gives a
+    trustworthy description of a second group of functions. Based on this
+    description, an interested entity can determine the trust it is to place in
+    this second group of functions. If the interested entity determines that
+    the trust level of the second group of functions is acceptable, the trust
+    boundary is extended from the Root of Trust to include the second group of
+    functions. In this case, the process can be iterated. The second group of
+    functions can give a trustworthy description of the third group of
+    functions, etc.
Transitive trust is used to provide a trustworthy
+    description of platform characteristics, and also to prove that
+    non-migratable keys are non-migratable
+
+    - TCG Glossary - https://trustedcomputinggroup.org/wp-content/uploads/TCG-Glossary-V1.1-Rev-1.0.pdf
+
+  trust
+    The confidence one element has in another that the second element will
+    behave as expected
+
+    - NISTIR 8320A - https://nvlpubs.nist.gov/nistpubs/ir/2021/NIST.IR.8320A.pdf
+
+  trust anchor
+    An authoritative entity for which trust is assumed.
+
+    - NIST SP 800-57 Part 1 Rev. 5 - https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-57pt1r5.pdf
+
+  trusted
+    An element that another element relies upon to fulfill critical
+    requirements on its behalf.
+
+    - NISTIR 8320A - https://nvlpubs.nist.gov/nistpubs/ir/2021/NIST.IR.8320A.pdf
+
+  trusted computing base (TCB)
+    Totality of protection mechanisms within a computer system, including
+    hardware, firmware, and software, the combination responsible for enforcing
+    a security policy.
+
+    - NIST CNSSI No. 4009 - https://www.cnss.gov/CNSS/issuances/Instructions.cfm
+
+  trusted computer system
+    A system that has the necessary security functions and assurance that the
+    security policy will be enforced and that can process a range of
+    information sensitivities (i.e. classified, controlled unclassified
+    information (CUI), or unclassified public information) simultaneously.
+
+    - NIST CNSSI No. 4009 - https://www.cnss.gov/CNSS/issuances/Instructions.cfm
+
+  trustworthiness
+    The attribute of a person or enterprise that provides confidence to others
+    of the qualifications, capabilities, and reliability of that entity to
+    perform specific tasks and fulfill assigned responsibilities.
+
+    - NIST CNSSI No.
4009 - https://www.cnss.gov/CNSS/issuances/Instructions.cfm
diff --git a/Documentation/security/launch-integrity/secure_launch_details.rst b/Documentation/security/launch-integrity/secure_launch_details.rst
new file mode 100644
index 0000000..2e71543
--- /dev/null
+++ b/Documentation/security/launch-integrity/secure_launch_details.rst
@@ -0,0 +1,564 @@
+===================================
+Secure Launch Config and Interfaces
+===================================
+
+Configuration
+=============
+
+The settings to enable Secure Launch using Kconfig are under::
+
+  "Processor type and features" --> "Secure Launch support"
+
+A kernel with this option enabled can still be booted using other supported
+methods.
+
+To reduce the Trusted Computing Base (TCB) of the MLE [1]_, the build
+configuration should be pared down as narrowly as one's use case allows.
+Fewer drivers (less active hardware) and fewer features reduce the attack
+surface. E.g. in the extreme, the MLE could have only local disk access
+and no other hardware support, or only network access for remote attestation.
+
+It is also desirable, if possible, to embed the initrd used with the MLE kernel
+image to reduce complexity.
+
+The following are a few important configuration necessities to always consider:
+
+KASLR Configuration
+-------------------
+
+Secure Launch does not interoperate with KASLR. If possible, the MLE should be
+built with KASLR disabled::
+
+  "Processor type and features" -->
+      "Build a relocatable kernel" -->
+          "Randomize the address of the kernel image (KASLR) [ ]"
+
+This unsets the Kconfig value CONFIG_RANDOMIZE_BASE.
+
+If not possible, KASLR must be disabled on the kernel command line when doing
+a Secure Launch as follows::
+
+  nokaslr
+
+IOMMU Configuration
+-------------------
+
+When doing a Secure Launch, the IOMMU should always be enabled and the drivers
+loaded. However, IOMMU passthrough mode should never be used.
Passthrough mode leaves the
+MLE completely exposed to DMA after the PMRs [2]_ are disabled. The current default
+mode is to use IOMMU in lazy translated mode but strict translated mode is the preferred
+IOMMU mode and this should be selected in the build configuration::
+
+  "Device Drivers" -->
+      "IOMMU Hardware Support" -->
+          "IOMMU default domain type" -->
+              "(X) Translated - Strict"
+
+In addition, the Intel IOMMU should be on by default. The following sets this as the
+default in the build configuration::
+
+  "Device Drivers" -->
+      "IOMMU Hardware Support" -->
+          "Support for Intel IOMMU using DMA Remapping Devices [*]"
+
+and::
+
+  "Device Drivers" -->
+      "IOMMU Hardware Support" -->
+          "Support for Intel IOMMU using DMA Remapping Devices [*]" -->
+              "Enable Intel DMA Remapping Devices by default [*]"
+
+It is recommended that no other command line options be set to override
+the defaults above.
+
+Secure Launch Resource Table
+============================
+
+The Secure Launch Resource Table (SLRT) is a platform-agnostic, standard format
+for providing information for the pre-launch environment and for passing
+information to the post-launch environment. The table is populated by one or
+more bootloaders in the boot chain and used by Secure Launch to set up
+the environment during post-launch. The details for the SLRT are documented
+in the TrenchBoot Secure Launch Specification [3]_.
+
+Intel TXT Interface
+===================
+
+The primary interfaces between the various components in TXT are the TXT MMIO
+registers and the TXT heap. The MMIO register banks are described in Appendix B
+of the TXT MLE [1]_ Development Guide.
+
+The TXT heap is described in Appendix C of the TXT MLE [1]_ Development
+Guide. Most of the TXT heap is predefined in the specification. The heap is
+initialized by firmware and the pre-launch environment and is subsequently used
+by the SINIT ACM. One section, called the OS to MLE Data Table, is reserved for
+software to define.
This table is set up per the recommendation detailed in
+Appendix B of the TrenchBoot Secure Launch Specification::
+
+        /*
+         * Secure Launch defined OS/MLE TXT Heap table
+         */
+        struct txt_os_mle_data {
+                u32 version;
+                u32 boot_params_addr;
+                struct slr_table *slrt;
+                u64 txt_info;
+                u32 ap_wake_block;
+                u32 ap_wake_block_size;
+                u8 mle_scratch[64];
+        } __packed;
+
+Description of structure:
+
+===================== ========================================================================
+Field                 Use
+===================== ========================================================================
+version               Structure version, current value 1
+boot_params_addr      Physical base address of the Linux boot parameters
+slrt                  Physical address of the Secure Launch Resource Table
+txt_info              Pointer into the SLRT for easily locating TXT specific table
+ap_wake_block         Physical address of the block of memory for parking APs after a launch
+ap_wake_block_size    Size of the AP wake block
+mle_scratch           Scratch area used post-launch by the MLE kernel. Fields:
+
+                        - SL_SCRATCH_AP_EBX area to share %ebx base pointer among CPUs
+                        - SL_SCRATCH_AP_JMP_OFFSET offset to abs. ljmp fixup location for APs
+===================== ========================================================================
+
+Error Codes
+-----------
+
+The TXT specification defines the layout for TXT 32-bit error code values.
+The bit encodings indicate where the error originated (e.g. with the CPU,
+in the SINIT ACM, in software). The error is written to a sticky TXT
+register called TXT.ERRORCODE that persists across resets (see the TXT
+MLE Development Guide). The errors defined by the Secure Launch feature are
+those generated in the MLE software. They have the format::
+
+  0xc0008XXX
+
+The low 12 bits are free for defining the following Secure Launch specific
+error codes.
+
+====== ================
+Name:  SL_ERROR_GENERIC
+Value: 0xc0008001
+====== ================
+
+Description:
+
+Generic catch all error.
Currently unused.
+
+====== =================
+Name:  SL_ERROR_TPM_INIT
+Value: 0xc0008002
+====== =================
+
+Description:
+
+The Secure Launch code failed to get access to the TPM hardware interface.
+This is most likely due to misconfigured hardware or kernel. Ensure the
+TPM chip is enabled and the kernel TPM support is built in (it should not be
+built as a module).
+
+====== ==========================
+Name:  SL_ERROR_TPM_INVALID_LOG20
+Value: 0xc0008003
+====== ==========================
+
+Description:
+
+The Secure Launch code failed to find a valid event log descriptor for TPM
+version 2.0 or the event log descriptor is malformed. Usually this indicates
+incompatible versions of the pre-launch environment and the MLE kernel.
+The pre-launch environment and the kernel share a structure in the TXT heap and
+if this structure (the OS-MLE table) is mismatched, this error is often seen.
+This TXT heap area is set up by the pre-launch environment so the issue may
+originate there. It could also be the sign of an attempted attack.
+
+====== ===========================
+Name:  SL_ERROR_TPM_LOGGING_FAILED
+Value: 0xc0008004
+====== ===========================
+
+Description:
+
+There was a failed attempt to write a TPM event to the event log early in the
+Secure Launch process. This is likely the result of a malformed TPM event log
+buffer. Formatting of the event log buffer information is done by the
+pre-launch environment so the issue most likely originates there.
+
+====== ============================
+Name:  SL_ERROR_REGION_STRADDLE_4GB
+Value: 0xc0008005
+====== ============================
+
+Description:
+
+During early validation a buffer or region was found to straddle the 4GB
+boundary. Because of the way TXT does DMA memory protection, this is an
+unsafe configuration and is flagged as an error. This is most likely a
+configuration issue in the pre-launch environment. It could also be the sign of
+an attempted attack.
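The straddle condition behind SL_ERROR_REGION_STRADDLE_4GB can be expressed as a small predicate. The sketch below is an illustration only: `region_straddles_4gb()` and `FOUR_GB` are hypothetical names, and the patch's actual validation code may be structured differently.

```c
#include <stdbool.h>
#include <stdint.h>

#define FOUR_GB (1ULL << 32)

/*
 * Returns true when the region [base, base + size) starts below the 4GB
 * boundary and ends above it. A region ending exactly at 4GB does not
 * straddle the boundary.
 */
bool region_straddles_4gb(uint64_t base, uint64_t size)
{
	if (size == 0 || base >= FOUR_GB)
		return false;
	/* base < FOUR_GB here, so the subtraction cannot underflow. */
	return size > FOUR_GB - base;
}
```

Comparing `size` against the remaining space below 4GB, rather than computing `base + size`, keeps the check free of integer wraparound.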
+
+====== ===================
+Name:  SL_ERROR_TPM_EXTEND
+Value: 0xc0008006
+====== ===================
+
+Description:
+
+There was a failed attempt to extend a TPM PCR in the Secure Launch platform
+module. This is most likely due to misconfigured hardware or kernel. Ensure
+the TPM chip is enabled and the kernel TPM support is built in (it should not
+be built as a module).
+
+====== ======================
+Name:  SL_ERROR_MTRR_INV_VCNT
+Value: 0xc0008007
+====== ======================
+
+Description:
+
+During early Secure Launch validation an invalid variable MTRR count was found.
+The pre-launch environment passes a number of MSR values to the MLE to restore
+including the MTRRs. The values are restored by the Secure Launch early entry
+point code. After measuring the values supplied by the pre-launch environment,
+a discrepancy was found while validating them. It could be the sign of an
+attempted attack.
+
+====== ==========================
+Name:  SL_ERROR_MTRR_INV_DEF_TYPE
+Value: 0xc0008008
+====== ==========================
+
+Description:
+
+During early Secure Launch validation an invalid default MTRR type was found.
+See SL_ERROR_MTRR_INV_VCNT for more details.
+
+====== ======================
+Name:  SL_ERROR_MTRR_INV_BASE
+Value: 0xc0008009
+====== ======================
+
+Description:
+
+During early Secure Launch validation an invalid variable MTRR base value was
+found. See SL_ERROR_MTRR_INV_VCNT for more details.
+
+====== ======================
+Name:  SL_ERROR_MTRR_INV_MASK
+Value: 0xc000800a
+====== ======================
+
+Description:
+
+During early Secure Launch validation an invalid variable MTRR mask value was
+found. See SL_ERROR_MTRR_INV_VCNT for more details.
+
+====== ========================
+Name:  SL_ERROR_MSR_INV_MISC_EN
+Value: 0xc000800b
+====== ========================
+
+Description:
+
+During early Secure Launch validation an invalid miscellaneous enable MSR value
+was found. See SL_ERROR_MTRR_INV_VCNT for more details.
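Every value in this list has the form 0xc0008XXX, with the low 12 bits selecting the specific error. A minimal sketch of composing and decoding such values follows. The macro and function names (`SL_CODE_PREFIX`, `SL_CODE_ID_MASK`, `sl_code_make()` and so on) are hypothetical and chosen for illustration; they are not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define SL_CODE_PREFIX	0xc0008000u	/* fixed upper 20 bits of the format */
#define SL_CODE_ID_MASK	0x00000fffu	/* low 12 bits: specific error */

/* Build a full 32-bit error value from a 12-bit error id. */
uint32_t sl_code_make(uint32_t id)
{
	return SL_CODE_PREFIX | (id & SL_CODE_ID_MASK);
}

/* True when a TXT.ERRORCODE value matches the 0xc0008XXX format. */
bool sl_code_is_secure_launch(uint32_t code)
{
	return (code & ~SL_CODE_ID_MASK) == SL_CODE_PREFIX;
}

/* Extract the 12-bit error id from a full error value. */
uint32_t sl_code_id(uint32_t code)
{
	return code & SL_CODE_ID_MASK;
}
```

For example, `sl_code_make(0x00b)` yields 0xc000800b, the value listed above for SL_ERROR_MSR_INV_MISC_EN.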
+
+====== =========================
+Name:  SL_ERROR_INV_AP_INTERRUPT
+Value: 0xc000800c
+====== =========================
+
+Description:
+
+The application processors (APs) wait to be woken up by the SMP initialization
+code. The only interrupt that they expect is an NMI; all other interrupts
+should be masked. If an AP gets an interrupt other than an NMI it will
+cause this error. This error is very unlikely to occur.
+
+====== =========================
+Name:  SL_ERROR_INTEGER_OVERFLOW
+Value: 0xc000800d
+====== =========================
+
+Description:
+
+A buffer base and size passed to the MLE caused an integer overflow when
+added together. This is most likely a configuration issue in the pre-launch
+environment. It could also be the sign of an attempted attack.
+
+====== ==================
+Name:  SL_ERROR_HEAP_WALK
+Value: 0xc000800e
+====== ==================
+
+Description:
+
+An error occurred in the TXT heap walking code. The underlying issue is a
+failure to early_memremap() portions of the heap, most likely due to a
+resource shortage.
+
+====== =================
+Name:  SL_ERROR_HEAP_MAP
+Value: 0xc000800f
+====== =================
+
+Description:
+
+This error is essentially the same as SL_ERROR_HEAP_WALK but occurred during the
+actual early_memremap() operation.
+
+====== =========================
+Name:  SL_ERROR_REGION_ABOVE_4GB
+Value: 0xc0008010
+====== =========================
+
+Description:
+
+A memory region used by the MLE is above 4GB. In general this is not a problem
+because memory > 4GB can be protected from DMA. There are certain buffers that
+should never be above 4GB though and one of these caused the violation. This is
+most likely a configuration issue in the pre-launch environment. It could also
+be the sign of an attempted attack.
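The overflow and above-4GB conditions described above both come down to computing the end of a region safely. The following sketch shows one way to do that in plain C. It assumes 64-bit base and size values, and `region_end()` and `region_below_4gb()` are hypothetical helpers, not functions from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define FOUR_GB (1ULL << 32)

/*
 * Compute base + size into *end. Returns false, leaving *end untouched,
 * when the addition would wrap around.
 */
bool region_end(uint64_t base, uint64_t size, uint64_t *end)
{
	if (size > UINT64_MAX - base)
		return false;
	*end = base + size;
	return true;
}

/* True when the whole region [base, base + size) lies below 4GB. */
bool region_below_4gb(uint64_t base, uint64_t size)
{
	uint64_t end;

	return region_end(base, size, &end) && end <= FOUR_GB;
}
```

Checking for wraparound before comparing against the 4GB limit matters: a wrapped sum would look like a small, apparently safe address.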
+
+====== ==========================
+Name:  SL_ERROR_HEAP_INVALID_DMAR
+Value: 0xc0008011
+====== ==========================
+
+Description:
+
+The backup copy of the ACPI DMAR table which is supposed to be located in the
+TXT heap could not be found. This is due to a bug in the platform's ACM module
+or in firmware.
+
+====== =======================
+Name:  SL_ERROR_HEAP_DMAR_SIZE
+Value: 0xc0008012
+====== =======================
+
+Description:
+
+The backup copy of the ACPI DMAR table in the TXT heap is too large to be stored
+for later usage. This error is very unlikely to occur since the area reserved
+for the copy is far larger than the DMAR should be.
+
+====== ======================
+Name:  SL_ERROR_HEAP_DMAR_MAP
+Value: 0xc0008013
+====== ======================
+
+Description:
+
+The backup copy of the ACPI DMAR table in the TXT heap could not be mapped. The
+underlying issue is a failure to early_memremap() the DMAR table, most likely
+due to a resource shortage.
+
+====== ====================
+Name:  SL_ERROR_HI_PMR_BASE
+Value: 0xc0008014
+====== ====================
+
+Description:
+
+On a system with more than 4GB of RAM, the high PMR [2]_ base address should be set
+to 4GB. This error is due to that not being the case. This PMR value is set by
+the pre-launch environment so the issue most likely originates there. It could also
+be the sign of an attempted attack.
+
+====== ====================
+Name:  SL_ERROR_HI_PMR_SIZE
+Value: 0xc0008015
+====== ====================
+
+Description:
+
+On a system with more than 4GB of RAM, the high PMR [2]_ size should be set to cover
+all RAM > 4GB. This error is due to that not being the case. This PMR value is
+set by the pre-launch environment so the issue most likely originates there. It
+could also be the sign of an attempted attack.
+
+====== ====================
+Name:  SL_ERROR_LO_PMR_BASE
+Value: 0xc0008016
+====== ====================
+
+Description:
+
+The low PMR [2]_ base should always be set to address zero.
This error is due to
+that not being the case. This PMR value is set by the pre-launch environment
+so the issue most likely originates there. It could also be the sign of an attempted
+attack.
+
+====== ====================
+Name:  SL_ERROR_LO_PMR_MLE
+Value: 0xc0008017
+====== ====================
+
+Description:
+
+This error indicates the MLE image is not covered by the low PMR [2]_ range. The
+PMR values are set by the pre-launch environment so the issue most likely originates
+there. It could also be the sign of an attempted attack.
+
+====== =======================
+Name:  SL_ERROR_INITRD_TOO_BIG
+Value: 0xc0008018
+====== =======================
+
+Description:
+
+The external initrd provided is larger than 4GB. This is not a valid
+configuration for a Secure Launch due to managing DMA protection.
+
+====== =========================
+Name:  SL_ERROR_HEAP_ZERO_OFFSET
+Value: 0xc0008019
+====== =========================
+
+Description:
+
+During a TXT heap walk an invalid/zero next table offset value was found. This
+indicates the TXT heap is malformed. The TXT heap is initialized by the
+pre-launch environment so the issue most likely originates there. It could also
+be a sign of an attempted attack. In addition, the ACM is also responsible for
+manipulating parts of the TXT heap so the issue could be due to a bug in the
+platform's ACM module.
+
+====== =============================
+Name:  SL_ERROR_WAKE_BLOCK_TOO_SMALL
+Value: 0xc000801a
+====== =============================
+
+Description:
+
+The AP wake block buffer passed to the MLE via the OS-MLE TXT heap table is not
+large enough. This value is set by the pre-launch environment so the issue most
+likely originates there. It also could be the sign of an attempted attack.
+
+====== ===========================
+Name:  SL_ERROR_MLE_BUFFER_OVERLAP
+Value: 0xc000801b
+====== ===========================
+
+Description:
+
+One of the buffers passed to the MLE via the OS-MLE TXT heap table overlaps
+with the MLE image in memory. This value is set by the pre-launch environment
+so the issue most likely originates there. It could also be the sign of an attempted
+attack.
+
+====== ==========================
+Name:  SL_ERROR_BUFFER_BEYOND_PMR
+Value: 0xc000801c
+====== ==========================
+
+Description:
+
+One of the buffers passed to the MLE via the OS-MLE TXT heap table is not
+protected by a PMR. This value is set by the pre-launch environment so the
+issue most likely originates there. It could also be the sign of an attempted
+attack.
+
+====== =============================
+Name:  SL_ERROR_OS_SINIT_BAD_VERSION
+Value: 0xc000801d
+====== =============================
+
+Description:
+
+The version of the OS-SINIT TXT heap table is bad. It must be 6 or greater.
+This value is set by the pre-launch environment so the issue most likely
+originates there. It could also be the sign of an attempted attack. It is also
+possible though very unlikely that the platform is so old that the ACM being
+used requires an unsupported version.
+
+====== =====================
+Name:  SL_ERROR_EVENTLOG_MAP
+Value: 0xc000801e
+====== =====================
+
+Description:
+
+An error occurred in the Secure Launch module while mapping the TPM event log.
+The underlying issue is a memremap() failure, most likely due to a resource
+shortage.
+
+====== ========================
+Name:  SL_ERROR_TPM_NUMBER_ALGS
+Value: 0xc000801f
+====== ========================
+
+Description:
+
+The TPM 2.0 event log reports an unsupported number of hashing algorithms.
+Secure Launch currently only supports a maximum of two: SHA1 and SHA256.
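The limit described above (at most two algorithms, SHA1 and SHA256) could be checked along the following lines. This is a sketch, not the patch's code: `log_algs_supported()` and `MAX_LOG_ALGS` are hypothetical names, treating an empty algorithm list as unsupported is an assumption, and the two algorithm identifiers are the values assigned in the TCG algorithm registry.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TPM_ALG_SHA1	0x0004	/* TCG algorithm registry value */
#define TPM_ALG_SHA256	0x000b	/* TCG algorithm registry value */
#define MAX_LOG_ALGS	2	/* limit stated in the description above */

/* True when the algorithm id is one of the two supported hashes. */
bool alg_supported(uint16_t alg_id)
{
	return alg_id == TPM_ALG_SHA1 || alg_id == TPM_ALG_SHA256;
}

/*
 * True when the event log advertises between one and MAX_LOG_ALGS
 * algorithms and every one of them is supported. Rejecting an empty
 * list is an assumption made for this sketch.
 */
bool log_algs_supported(const uint16_t *alg_ids, size_t count)
{
	size_t i;

	if (count == 0 || count > MAX_LOG_ALGS)
		return false;
	for (i = 0; i < count; i++) {
		if (!alg_supported(alg_ids[i]))
			return false;
	}
	return true;
}
```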
+
+====== ===========================
+Name:  SL_ERROR_TPM_UNKNOWN_DIGEST
+Value: 0xc0008020
+====== ===========================
+
+Description:
+
+The TPM 2.0 event log reports an unsupported hashing algorithm. Secure Launch
+currently only supports two algorithms: SHA1 and SHA256.
+
+====== ==========================
+Name:  SL_ERROR_TPM_INVALID_EVENT
+Value: 0xc0008021
+====== ==========================
+
+Description:
+
+An invalid/malformed event was found in the TPM event log while reading it.
+Since only trusted entities are supposed to be writing the event log, this
+would indicate either a bug or a possible attack.
+
+====== =====================
+Name:  SL_ERROR_INVALID_SLRT
+Value: 0xc0008022
+====== =====================
+
+Description:
+
+The Secure Launch Resource Table is invalid or malformed and is unusable.
+This implies the pre-launch code did not properly set up the SLRT.
+
+====== ===========================
+Name:  SL_ERROR_SLRT_MISSING_ENTRY
+Value: 0xc0008023
+====== ===========================
+
+Description:
+
+The Secure Launch Resource Table is missing a required entry within it.
+This implies the pre-launch code did not properly set up the SLRT.
+
+====== =================
+Name:  SL_ERROR_SLRT_MAP
+Value: 0xc0008024
+====== =================
+
+Description:
+
+An error occurred in the Secure Launch module while mapping the Secure Launch
+Resource table. The underlying issue is a memremap() failure, most likely due to
+a resource shortage.
+
+.. [1]
+    MLE: Measured Launch Environment is the binary runtime that is measured and
+    then run by the TXT SINIT ACM. The TXT MLE Development Guide describes the
+    requirements for the MLE in detail.
+
+.. [2]
+    PMR: Intel VT-d has a feature in the IOMMU called Protected Memory Registers.
+    There are two of these registers and they allow all DMA to be blocked
+    to large areas of memory. The low PMR can cover all memory below 4GB on 2MB
+    boundaries.
The high PMR can cover all RAM on the system, again on 2MB
+    boundaries. This feature is used during a Secure Launch by TXT.
+
+.. [3]
+    Secure Launch Specification: https://trenchboot.org/specifications/Secure_Launch/
diff --git a/Documentation/security/launch-integrity/secure_launch_overview.rst b/Documentation/security/launch-integrity/secure_launch_overview.rst
new file mode 100644
index 0000000..ba91d73
--- /dev/null
+++ b/Documentation/security/launch-integrity/secure_launch_overview.rst
@@ -0,0 +1,220 @@
+======================
+Secure Launch Overview
+======================
+
+Overview
+========
+
+Prior to the start of the TrenchBoot project, the only active Open Source
+project supporting dynamic launch was Intel's tboot project to support their
+implementation of dynamic launch known as Intel Trusted eXecution Technology
+(TXT). The approach taken by tboot was to provide an exokernel that could
+handle the launch protocol implemented by Intel's special loader, the SINIT
+Authenticated Code Module (ACM [2]_), and remain in memory to manage the SMX
+CPU mode that a dynamic launch would put a system into. While it is not precluded
+from being used for doing a late launch, tboot's primary use case was to be
+used as an early launch solution. As a result the TrenchBoot project started
+the development of the Secure Launch kernel feature to provide a more generalized
+approach. The focus of the effort is twofold. The first is to make the Linux
+kernel directly aware of the launch protocol used by Intel, AMD/Hygon, Arm, and
+potentially OpenPOWER. The second is to make the Linux kernel able to
+initiate a dynamic launch. It is through this approach that the Secure Launch
+kernel feature creates a basis for the Linux kernel to be used in a variety of
+dynamic launch use cases.
+
+.. note::
+    A quick note on terminology. The larger open source project itself is
+    called TrenchBoot, which is hosted on GitHub (links below).
The kernel + feature enabling the use of the x86 technology is referred to as "Secure + Launch" within the kernel code. + +Goals +===== + +The first use case that the TrenchBoot project focused on was the ability for +the Linux kernel to be started by a dynamic launch, in particular as part of an +early launch sequence. In this case the dynamic launch will be initiated by any +boot loader with associated support added to it; for example, the first targeted +boot loader was GRUB2. An integral part of establishing +measurement-based launch integrity is measuring everything that is +intended to be executed (kernel image, initrd, etc.) and everything that will +configure that kernel to execute (command line, boot params, etc.), and then storing +those measurements in a protected manner. Both the Intel and AMD dynamic launch +implementations leverage the Trusted Platform Module (TPM) to store those +measurements. The TPM itself has been designed such that a dynamic launch +unlocks a specific set of Platform Configuration Registers (PCR) for holding +measurements taken during the dynamic launch. These are referred to as the DRTM +PCRs, PCRs 17-22. Further details on this process can be found in the +documentation for the GETSEC instruction provided by Intel's TXT and the SKINIT +instruction provided by AMD's AMD-V. The documentation on these technologies +can be readily found online; see the `Resources`_ section below for references. + +.. note:: + Currently only Intel TXT is supported in this first release of the Secure + Launch feature. AMD/Hygon SKINIT and Arm support will be added in a + subsequent release. + +To enable the kernel to be launched by GETSEC, a stub, the Secure Launch stub, +must be built into the setup section of the compressed kernel to handle the +specific state that the dynamic launch process leaves the BSP in. Also, the Secure +Launch stub must measure everything that is going to be used as early as +possible. 
This stub code and subsequent code must also deal with the specific +state that the dynamic launch leaves the APs in as well. + +Design Decisions +================ + +A number of design decisions were made during the development of the Secure +Launch feature. The two primary guiding decisions were: + + - Keeping the Secure Launch code as separate from the rest of the kernel + as possible. + - Modifying the existing boot path of the kernel as little as possible. + +The following points illustrate how the implementation followed these design +decisions: + + - All the entry point code necessary to properly configure the system post + launch is found in sl_stub.S in the compressed kernel image. This code + validates the state of the system, restores necessary system operating + configurations and properly handles post launch CPU states. + - After sl_stub.S completes, it jumps directly to the unmodified + startup_32 kernel entry point. + - A single call is made to a function sl_main() prior to the main kernel + decompression step. This code performs further validation and takes the + needed DRTM measurements. + - After the call to sl_main(), the main kernel is decompressed and boots as + it normally would. + - Final setup for the Secure Launch kernel is done in a separate Secure + Launch module that is loaded via a late initcall. This code is responsible + for extending the measurements taken earlier into the TPM DRTM PCRs and + setting up the securityfs interface to allow access to the TPM event log and + public TXT registers. + - On the reboot and kexec paths, calls are made to a function to finalize the + state of the Secure Launch kernel. + +The one place where Secure Launch code is mixed directly in with kernel code is +in the SMP boot code. This is due to the unique state that the dynamic launch +leaves the APs in. On Intel this involves using a method other than the +standard INIT-SIPI sequence. 
+ +A final note is that originally the extending of the PCRs was completed in the +Secure Launch stub when the measurements were taken. An alternative solution +had to be implemented due to the TPM maintainers objecting to the PCR +extensions being done with a minimal interface to the TPM that was an +independent implementation of the mainline kernel driver. Since the mainline +driver relies heavily on kernel interfaces not available in the compressed +kernel, it was not possible to reuse the mainline TPM driver. This resulted in +the decision to move the extension operations to the Secure Launch module in +the mainline kernel where the TPM driver would be available. + +Basic Boot Flow +=============== + +Outlined here is a summary of the boot flow for Secure Launch. A more detailed +review of the Secure Launch process can be found in the Secure Launch +Specification; a link is located in the `Resources`_ section. + +Pre-launch: *Phase where the environment is prepared and configured to initiate the +secure launch by the boot chain.* + + - The SLRT is initialized and dl_stub is placed in memory. + - Load the kernel, initrd and ACM [2]_ into memory. + - Set up the TXT heap and page tables describing the MLE [1]_ per the + specification. + - On non-UEFI platforms, dl_stub is called. + - On UEFI platforms, the SLRT is registered with UEFI and efi-stub is called. + - Upon completion, efi-stub will call EBS followed by dl_stub. + - The dl_stub will prepare the CPU and the TPM for the launch. + - The secure launch is then initiated with the GETSEC[SENTER] instruction. + +Post-launch: *Phase where control is passed from the ACM to the MLE and the secure +kernel begins execution.* + + - Entry from the dynamic launch jumps to the SL stub. + - SL stub fixes up the world on the BSP. + - For TXT, SL stub wakes the APs, fixes up their worlds. + - For TXT, APs are left halted waiting for an NMI to wake them. + - SL stub jumps to startup_32. 
+ - SL main does validation of buffers and memory locations. It sets + the boot parameter loadflag value SLAUNCH_FLAG to inform the main + kernel that a Secure Launch was done. + - SL main locates the TPM event log and writes the measurements of + configuration and module information into it. + - Kernel boot proceeds normally from this point. + - During early setup, slaunch_setup() runs to finish some validation + and setup tasks. + - The SMP bring-up code is modified to wake the waiting APs. APs vector + to rmpiggy and start up normally from that point. + - SL platform module is registered as a late initcall module. It reads + the TPM event log and extends the measurements taken into the TPM PCRs. + - SL platform module initializes the securityfs interface to allow + access to the TPM event log and TXT public registers. + - Kernel boot finishes normally. + - SEXIT support to leave SMX mode is present on the kexec path and + the various reboot paths (poweroff, reset, halt). + +PCR Usage +========= + +In the TCG DRTM architecture there are three PCRs defined for usage, PCR.Details +(PCR17), PCR.Authorities (PCR18), and PCR.DLME_Authority (PCR19). For a deeper +understanding of Details and Authorities it is recommended to review the TCG +DRTM architecture. + +To determine PCR usage, Linux Secure Launch follows the TrenchBoot Secure +Launch Specification of using a measurement policy stored in the SLRT. The +policy details what should be measured and the PCR in which to store the +measurement. The measurement policy provides the ability to select the +PCR.DLME_Detail (PCR20) PCR as the location for the DRTM components measured by +the kernel, e.g. external initrd image. This can then be combined with storing +the user authority in the PCR.DLME_Authority PCR to seal/attest to different +variations of platform details/authorities and user details/authorities. An +example of how this can be achieved was presented in the FOSDEM 2021 talk +"Secure Upgrades with DRTM". 
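The log-then-extend flow described above (SL main records measurements in the TPM event log, and the SL platform module later replays that log into the DRTM PCRs) can be modelled with a short Python sketch. This is an illustration only and not kernel code: the `Measurement` type, the `measure()` and `replay_log()` helpers, the sample data, and the PCR assignments are all hypothetical, and the PCRs are assumed to start at all zeros. The only rule taken as given is the standard TPM extend operation, new = H(old || digest), shown here with SHA256.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA256


@dataclass
class Measurement:
    """Hypothetical event-log entry: target PCR, digest, description."""
    pcr: int
    digest: bytes
    description: str


def measure(pcr: int, data: bytes, description: str) -> Measurement:
    """Hash the data now; the extend into the PCR is deferred."""
    return Measurement(pcr, hashlib.sha256(data).digest(), description)


def extend(pcr_value: bytes, digest: bytes) -> bytes:
    """TPM extend rule: new = SHA256(old || digest)."""
    return hashlib.sha256(pcr_value + digest).digest()


def replay_log(log: List[Measurement]) -> Dict[int, bytes]:
    """Replay logged measurements, in order, into simulated PCRs."""
    pcrs: Dict[int, bytes] = {}
    for event in log:
        # Assumption for illustration: an untouched PCR is all zeros.
        old = pcrs.get(event.pcr, bytes(DIGEST_SIZE))
        pcrs[event.pcr] = extend(old, event.digest)
    return pcrs


# Early phase: measurements are only recorded in the log.
# The PCR numbers below are a made-up sample policy.
event_log = [
    measure(17, b"kernel command line", "command line"),
    measure(17, b"boot params", "boot params"),
    measure(20, b"external initrd image", "initrd"),
]

# Late phase: the log is replayed into the PCRs.
pcrs = replay_log(event_log)
for index in sorted(pcrs):
    print(f"PCR{index}: {pcrs[index].hex()}")
```

Because each extend folds the previous PCR value into the hash, replaying the same log in a different order yields a different final value, which is why the log must be replayed in the order in which the measurements were taken.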
+ +Resources +========= + +The TrenchBoot project: + +https://trenchboot.org + +Secure Launch Specification: + +https://trenchboot.org/specifications/Secure_Launch/ + +Trusted Computing Group's D-RTM Architecture: + +https://trustedcomputinggroup.org/wp-content/uploads/TCG_D-RTM_Architecture_v1-0_Published_06172013.pdf + +TXT documentation in the Intel TXT MLE Development Guide: + +https://www.intel.com/content/dam/www/public/us/en/documents/guides/intel-txt-software-development-guide.pdf + +TXT instructions documentation in the Intel SDM Instruction Set volume: + +https://software.intel.com/en-us/articles/intel-sdm + +AMD SKINIT documentation in the System Programming manual: + +https://www.amd.com/system/files/TechDocs/24593.pdf + +GRUB Secure Launch support: + +https://github.com/TrenchBoot/grub/tree/grub-sl-fc-38-dlstub + +FOSDEM 2021: Secure Upgrades with DRTM + +https://archive.fosdem.org/2021/schedule/event/firmware_suwd/ + +.. [1] + MLE: Measured Launch Environment is the binary runtime that is measured and + then run by the TXT SINIT ACM. The TXT MLE Development Guide describes the + requirements for the MLE in detail. + +.. [2] + ACM: Intel's Authenticated Code Module. This is the 32-bit binary blob that + is run securely by the GETSEC[SENTER] instruction during a measured launch. It is described + in the Intel documentation on TXT and versions for various chipsets are + signed and distributed by Intel.