diff options
author | nkozlovskiy <nmk@ydb.tech> | 2023-10-11 19:11:46 +0300 |
---|---|---|
committer | nkozlovskiy <nmk@ydb.tech> | 2023-10-11 19:33:28 +0300 |
commit | 61b3971447e473726d6cdb23fc298e457b4d973c (patch) | |
tree | e2a2a864bb7717f7ae6138f6a3194a254dd2c7bb /contrib/libs/clang14-rt/lib/dfsan | |
parent | a674dc57d88d43c2e8e90a6084d5d2c988e0402c (diff) | |
download | ydb-61b3971447e473726d6cdb23fc298e457b4d973c.tar.gz |
add sanitizers dependencies
Diffstat (limited to 'contrib/libs/clang14-rt/lib/dfsan')
17 files changed, 5484 insertions, 0 deletions
diff --git a/contrib/libs/clang14-rt/lib/dfsan/.yandex_meta/licenses.list.txt b/contrib/libs/clang14-rt/lib/dfsan/.yandex_meta/licenses.list.txt new file mode 100644 index 0000000000..591a53066f --- /dev/null +++ b/contrib/libs/clang14-rt/lib/dfsan/.yandex_meta/licenses.list.txt @@ -0,0 +1,366 @@ +====================Apache-2.0==================== + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. + + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + + +====================Apache-2.0 WITH LLVM-exception==================== +---- LLVM Exceptions to the Apache 2.0 License ---- + +As an exception, if, as a result of your compiling your source code, portions +of this Software are embedded into an Object form of such source code, you +may redistribute such embedded portions in such Object form without complying +with the conditions of Sections 4(a), 4(b) and 4(d) of the License. + +In addition, if you combine or link compiled forms of this Software with +software that is licensed under the GPLv2 ("Combined Software") and if a +court of competent jurisdiction determines that the patent provision (Section +3), the indemnity provision (Section 9) or other Section of the License +conflicts with the conditions of the GPLv2, you may retroactively and +prospectively choose to deem waived or otherwise exclude such Section(s) of +the License, but only in their entirety and only with respect to the Combined +Software. + + +====================Apache-2.0 WITH LLVM-exception==================== +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. + + +====================Apache-2.0 WITH LLVM-exception==================== +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception + + +====================Apache-2.0 WITH LLVM-exception==================== +The LLVM Project is under the Apache License v2.0 with LLVM Exceptions: + + +====================Apache-2.0 WITH LLVM-exception==================== +|* Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +|* See https://llvm.org/LICENSE.txt for license information. + + +====================Apache-2.0 WITH LLVM-exception==================== +|* SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception + + +====================COPYRIGHT==================== + InitCache(c); + TransferBatch *b = allocator->AllocateBatch(&stats_, this, class_id); + if (UNLIKELY(!b)) + + +====================COPYRIGHT==================== +Copyright (c) 2009-2015 by the contributors listed in CREDITS.TXT + + +====================COPYRIGHT==================== +Copyright (c) 2009-2019 by the contributors listed in CREDITS.TXT + + +====================File: CREDITS.TXT==================== +This file is a partial list of people who have contributed to the LLVM/CompilerRT +project. If you have contributed a patch or made some other contribution to +LLVM/CompilerRT, please submit a patch to this file to add yourself, and it will be +done! + +The list is sorted by surname and formatted to allow easy grepping and +beautification by scripts. The fields are: name (N), email (E), web-address +(W), PGP key ID and fingerprint (P), description (D), and snail-mail address +(S). + +N: Craig van Vliet +E: cvanvliet@auroraux.org +W: http://www.auroraux.org +D: Code style and Readability fixes. + +N: Edward O'Callaghan +E: eocallaghan@auroraux.org +W: http://www.auroraux.org +D: CMake'ify Compiler-RT build system +D: Maintain Solaris & AuroraUX ports of Compiler-RT + +N: Howard Hinnant +E: hhinnant@apple.com +D: Architect and primary author of compiler-rt + +N: Guan-Hong Liu +E: koviankevin@hotmail.com +D: IEEE Quad-precision functions + +N: Joerg Sonnenberger +E: joerg@NetBSD.org +D: Maintains NetBSD port. + +N: Matt Thomas +E: matt@NetBSD.org +D: ARM improvements. + + +====================MIT==================== +Permission is hereby granted, free of charge, to any person obtaining a copy +of this software and associated documentation files (the "Software"), to deal +in the Software without restriction, including without limitation the rights +to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +copies of the Software, and to permit persons to whom the Software is +furnished to do so, subject to the following conditions: + +The above copyright notice and this permission notice shall be included in +all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +THE SOFTWARE. + +====================NCSA==================== +Compiler-RT is open source software. You may freely distribute it under the +terms of the license agreement found in LICENSE.txt. + + +====================NCSA==================== +Legacy LLVM License (https://llvm.org/docs/DeveloperPolicy.html#legacy): + + +====================NCSA==================== +Permission is hereby granted, free of charge, to any person obtaining a copy of +this software and associated documentation files (the "Software"), to deal with +the Software without restriction, including without limitation the rights to +use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies +of the Software, and to permit persons to whom the Software is furnished to do +so, subject to the following conditions: + + * Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimers. + + * Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimers in the + documentation and/or other materials provided with the distribution. + + * Neither the names of the LLVM Team, University of Illinois at + Urbana-Champaign, nor the names of its contributors may be used to + endorse or promote products derived from this Software without specific + prior written permission. + +THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS +FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +CONTRIBUTORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS WITH THE +SOFTWARE. + + +====================NCSA==================== +University of Illinois/NCSA +Open Source License + + +====================NCSA AND MIT==================== +The compiler_rt library is dual licensed under both the University of Illinois +"BSD-Like" license and the MIT license. As a user of this code you may choose +to use it under either license. As a contributor, you agree to allow your code +to be used under both. + +Full text of the relevant licenses is included below. diff --git a/contrib/libs/clang14-rt/lib/dfsan/dfsan.cpp b/contrib/libs/clang14-rt/lib/dfsan/dfsan.cpp new file mode 100644 index 0000000000..afb01c7d88 --- /dev/null +++ b/contrib/libs/clang14-rt/lib/dfsan/dfsan.cpp @@ -0,0 +1,1127 @@ +//===-- dfsan.cpp ---------------------------------------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file is a part of DataFlowSanitizer. +// +// DataFlowSanitizer runtime. This file defines the public interface to +// DataFlowSanitizer as well as the definition of certain runtime functions +// called automatically by the compiler (specifically the instrumentation pass +// in llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp). +// +// The public interface is defined in include/sanitizer/dfsan_interface.h whose +// functions are prefixed dfsan_ while the compiler interface functions are +// prefixed __dfsan_. +//===----------------------------------------------------------------------===// + +#include "dfsan/dfsan.h" + +#include "dfsan/dfsan_chained_origin_depot.h" +#include "dfsan/dfsan_flags.h" +#include "dfsan/dfsan_origin.h" +#include "dfsan/dfsan_thread.h" +#include "sanitizer_common/sanitizer_atomic.h" +#include "sanitizer_common/sanitizer_common.h" +#include "sanitizer_common/sanitizer_file.h" +#include "sanitizer_common/sanitizer_flag_parser.h" +#include "sanitizer_common/sanitizer_flags.h" +#include "sanitizer_common/sanitizer_internal_defs.h" +#include "sanitizer_common/sanitizer_libc.h" +#include "sanitizer_common/sanitizer_report_decorator.h" +#include "sanitizer_common/sanitizer_stacktrace.h" + +using namespace __dfsan; + +Flags __dfsan::flags_data; + +// The size of TLS variables. These constants must be kept in sync with the ones +// in DataFlowSanitizer.cpp. +static const int kDFsanArgTlsSize = 800; +static const int kDFsanRetvalTlsSize = 800; +static const int kDFsanArgOriginTlsSize = 800; + +SANITIZER_INTERFACE_ATTRIBUTE THREADLOCAL u64 + __dfsan_retval_tls[kDFsanRetvalTlsSize / sizeof(u64)]; +SANITIZER_INTERFACE_ATTRIBUTE THREADLOCAL u32 __dfsan_retval_origin_tls; +SANITIZER_INTERFACE_ATTRIBUTE THREADLOCAL u64 + __dfsan_arg_tls[kDFsanArgTlsSize / sizeof(u64)]; +SANITIZER_INTERFACE_ATTRIBUTE THREADLOCAL u32 + __dfsan_arg_origin_tls[kDFsanArgOriginTlsSize / sizeof(u32)]; + +// Instrumented code may set this value in terms of -dfsan-track-origins. +// * undefined or 0: do not track origins. +// * 1: track origins at memory store operations. +// * 2: track origins at memory load and store operations. +// TODO: track callsites. +extern "C" SANITIZER_WEAK_ATTRIBUTE const int __dfsan_track_origins; + +extern "C" SANITIZER_INTERFACE_ATTRIBUTE int dfsan_get_track_origins() { + return &__dfsan_track_origins ? __dfsan_track_origins : 0; +} + +// On Linux/x86_64, memory is laid out as follows: +// +// +--------------------+ 0x800000000000 (top of memory) +// | application 3 | +// +--------------------+ 0x700000000000 +// | invalid | +// +--------------------+ 0x610000000000 +// | origin 1 | +// +--------------------+ 0x600000000000 +// | application 2 | +// +--------------------+ 0x510000000000 +// | shadow 1 | +// +--------------------+ 0x500000000000 +// | invalid | +// +--------------------+ 0x400000000000 +// | origin 3 | +// +--------------------+ 0x300000000000 +// | shadow 3 | +// +--------------------+ 0x200000000000 +// | origin 2 | +// +--------------------+ 0x110000000000 +// | invalid | +// +--------------------+ 0x100000000000 +// | shadow 2 | +// +--------------------+ 0x010000000000 +// | application 1 | +// +--------------------+ 0x000000000000 +// +// MEM_TO_SHADOW(mem) = mem ^ 0x500000000000 +// SHADOW_TO_ORIGIN(shadow) = shadow + 0x100000000000 + +extern "C" SANITIZER_INTERFACE_ATTRIBUTE +dfsan_label __dfsan_union_load(const dfsan_label *ls, uptr n) { + dfsan_label label = ls[0]; + for (uptr i = 1; i != n; ++i) + label |= ls[i]; + return label; +} + +// Return the union of all the n labels from addr at the high 32 bit, and the +// origin of the first taint byte at the low 32 bit. +extern "C" SANITIZER_INTERFACE_ATTRIBUTE u64 +__dfsan_load_label_and_origin(const void *addr, uptr n) { + dfsan_label label = 0; + u64 ret = 0; + uptr p = (uptr)addr; + dfsan_label *s = shadow_for((void *)p); + for (uptr i = 0; i < n; ++i) { + dfsan_label l = s[i]; + if (!l) + continue; + label |= l; + if (!ret) + ret = *(dfsan_origin *)origin_for((void *)(p + i)); + } + return ret | (u64)label << 32; +} + +extern "C" SANITIZER_INTERFACE_ATTRIBUTE +void __dfsan_unimplemented(char *fname) { + if (flags().warn_unimplemented) + Report("WARNING: DataFlowSanitizer: call to uninstrumented function %s\n", + fname); +} + +// Use '-mllvm -dfsan-debug-nonzero-labels' and break on this function +// to try to figure out where labels are being introduced in a nominally +// label-free program. +extern "C" SANITIZER_INTERFACE_ATTRIBUTE void __dfsan_nonzero_label() { + if (flags().warn_nonzero_labels) + Report("WARNING: DataFlowSanitizer: saw nonzero label\n"); +} + +// Indirect call to an uninstrumented vararg function. We don't have a way of +// handling these at the moment. +extern "C" SANITIZER_INTERFACE_ATTRIBUTE void +__dfsan_vararg_wrapper(const char *fname) { + Report("FATAL: DataFlowSanitizer: unsupported indirect call to vararg " + "function %s\n", fname); + Die(); +} + +// Resolves the union of two labels. +SANITIZER_INTERFACE_ATTRIBUTE dfsan_label +dfsan_union(dfsan_label l1, dfsan_label l2) { + return l1 | l2; +} + +static const uptr kOriginAlign = sizeof(dfsan_origin); +static const uptr kOriginAlignMask = ~(kOriginAlign - 1UL); + +static uptr OriginAlignUp(uptr u) { + return (u + kOriginAlign - 1) & kOriginAlignMask; +} + +static uptr OriginAlignDown(uptr u) { return u & kOriginAlignMask; } + +// Return the origin of the first taint byte in the size bytes from the address +// addr. +static dfsan_origin GetOriginIfTainted(uptr addr, uptr size) { + for (uptr i = 0; i < size; ++i, ++addr) { + dfsan_label *s = shadow_for((void *)addr); + + if (*s) { + // Validate address region. + CHECK(MEM_IS_SHADOW(s)); + return *(dfsan_origin *)origin_for((void *)addr); + } + } + return 0; +} + +// For platforms which support slow unwinder only, we need to restrict the store +// context size to 1, basically only storing the current pc, because the slow +// unwinder which is based on libunwind is not async signal safe and causes +// random freezes in forking applications as well as in signal handlers. +// DFSan supports only Linux. So we do not restrict the store context size. +#define GET_STORE_STACK_TRACE_PC_BP(pc, bp) \ + BufferedStackTrace stack; \ + stack.Unwind(pc, bp, nullptr, true, flags().store_context_size); + +#define PRINT_CALLER_STACK_TRACE \ + { \ + GET_CALLER_PC_BP_SP; \ + (void)sp; \ + GET_STORE_STACK_TRACE_PC_BP(pc, bp) \ + stack.Print(); \ + } + +// Return a chain with the previous ID id and the current stack. +// from_init = true if this is the first chain of an origin tracking path. +static u32 ChainOrigin(u32 id, StackTrace *stack, bool from_init = false) { + // StackDepot is not async signal safe. Do not create new chains in a signal + // handler. + DFsanThread *t = GetCurrentThread(); + if (t && t->InSignalHandler()) + return id; + + // As an optimization the origin of an application byte is updated only when + // its shadow is non-zero. Because we are only interested in the origins of + // taint labels, it does not matter what origin a zero label has. This reduces + // memory write cost. MSan does similar optimization. The following invariant + // may not hold because of some bugs. We check the invariant to help debug. + if (!from_init && id == 0 && flags().check_origin_invariant) { + Printf(" DFSan found invalid origin invariant\n"); + PRINT_CALLER_STACK_TRACE + } + + Origin o = Origin::FromRawId(id); + stack->tag = StackTrace::TAG_UNKNOWN; + Origin chained = Origin::CreateChainedOrigin(o, stack); + return chained.raw_id(); +} + +static void ChainAndWriteOriginIfTainted(uptr src, uptr size, uptr dst, + StackTrace *stack) { + dfsan_origin o = GetOriginIfTainted(src, size); + if (o) { + o = ChainOrigin(o, stack); + *(dfsan_origin *)origin_for((void *)dst) = o; + } +} + +// Copy the origins of the size bytes from src to dst. The source and target +// memory ranges cannot be overlapped. This is used by memcpy. stack records the +// stack trace of the memcpy. When dst and src are not 4-byte aligned properly, +// origins at the unaligned address boundaries may be overwritten because four +// contiguous bytes share the same origin. +static void CopyOrigin(const void *dst, const void *src, uptr size, + StackTrace *stack) { + uptr d = (uptr)dst; + uptr beg = OriginAlignDown(d); + // Copy left unaligned origin if that memory is tainted. + if (beg < d) { + ChainAndWriteOriginIfTainted((uptr)src, beg + kOriginAlign - d, beg, stack); + beg += kOriginAlign; + } + + uptr end = OriginAlignDown(d + size); + // If both ends fall into the same 4-byte slot, we are done. + if (end < beg) + return; + + // Copy right unaligned origin if that memory is tainted. + if (end < d + size) + ChainAndWriteOriginIfTainted((uptr)src + (end - d), (d + size) - end, end, + stack); + + if (beg >= end) + return; + + // Align src up. + uptr src_a = OriginAlignUp((uptr)src); + dfsan_origin *src_o = origin_for((void *)src_a); + u32 *src_s = (u32 *)shadow_for((void *)src_a); + dfsan_origin *src_end = origin_for((void *)(src_a + (end - beg))); + dfsan_origin *dst_o = origin_for((void *)beg); + dfsan_origin last_src_o = 0; + dfsan_origin last_dst_o = 0; + for (; src_o < src_end; ++src_o, ++src_s, ++dst_o) { + if (!*src_s) + continue; + if (*src_o != last_src_o) { + last_src_o = *src_o; + last_dst_o = ChainOrigin(last_src_o, stack); + } + *dst_o = last_dst_o; + } +} + +// Copy the origins of the size bytes from src to dst. The source and target +// memory ranges may be overlapped. So the copy is done in a reverse order. +// This is used by memmove. stack records the stack trace of the memmove. +static void ReverseCopyOrigin(const void *dst, const void *src, uptr size, + StackTrace *stack) { + uptr d = (uptr)dst; + uptr end = OriginAlignDown(d + size); + + // Copy right unaligned origin if that memory is tainted. + if (end < d + size) + ChainAndWriteOriginIfTainted((uptr)src + (end - d), (d + size) - end, end, + stack); + + uptr beg = OriginAlignDown(d); + + if (beg + kOriginAlign < end) { + // Align src up. + uptr src_a = OriginAlignUp((uptr)src); + void *src_end = (void *)(src_a + end - beg - kOriginAlign); + dfsan_origin *src_end_o = origin_for(src_end); + u32 *src_end_s = (u32 *)shadow_for(src_end); + dfsan_origin *src_begin_o = origin_for((void *)src_a); + dfsan_origin *dst = origin_for((void *)(end - kOriginAlign)); + dfsan_origin last_src_o = 0; + dfsan_origin last_dst_o = 0; + for (; src_end_o >= src_begin_o; --src_end_o, --src_end_s, --dst) { + if (!*src_end_s) + continue; + if (*src_end_o != last_src_o) { + last_src_o = *src_end_o; + last_dst_o = ChainOrigin(last_src_o, stack); + } + *dst = last_dst_o; + } + } + + // Copy left unaligned origin if that memory is tainted. + if (beg < d) + ChainAndWriteOriginIfTainted((uptr)src, beg + kOriginAlign - d, beg, stack); +} + +// Copy or move the origins of the len bytes from src to dst. The source and +// target memory ranges may or may not be overlapped. This is used by memory +// transfer operations. stack records the stack trace of the memory transfer +// operation. +static void MoveOrigin(const void *dst, const void *src, uptr size, + StackTrace *stack) { + // Validate address regions. + if (!MEM_IS_SHADOW(shadow_for(dst)) || + !MEM_IS_SHADOW(shadow_for((void *)((uptr)dst + size))) || + !MEM_IS_SHADOW(shadow_for(src)) || + !MEM_IS_SHADOW(shadow_for((void *)((uptr)src + size)))) { + CHECK(false); + return; + } + // If destination origin range overlaps with source origin range, move + // origins by copying origins in a reverse order; otherwise, copy origins in + // a normal order. The orders of origin transfer are consistent with the + // orders of how memcpy and memmove transfer user data. + uptr src_aligned_beg = OriginAlignDown((uptr)src); + uptr src_aligned_end = OriginAlignDown((uptr)src + size); + uptr dst_aligned_beg = OriginAlignDown((uptr)dst); + if (dst_aligned_beg < src_aligned_end && dst_aligned_beg >= src_aligned_beg) + return ReverseCopyOrigin(dst, src, size, stack); + return CopyOrigin(dst, src, size, stack); +} + +// Set the size bytes from the addres dst to be the origin value. +static void SetOrigin(const void *dst, uptr size, u32 origin) { + if (size == 0) + return; + + // Origin mapping is 4 bytes per 4 bytes of application memory. + // Here we extend the range such that its left and right bounds are both + // 4 byte aligned. + uptr x = unaligned_origin_for((uptr)dst); + uptr beg = OriginAlignDown(x); + uptr end = OriginAlignUp(x + size); // align up. + u64 origin64 = ((u64)origin << 32) | origin; + // This is like memset, but the value is 32-bit. We unroll by 2 to write + // 64 bits at once. May want to unroll further to get 128-bit stores. + if (beg & 7ULL) { + if (*(u32 *)beg != origin) + *(u32 *)beg = origin; + beg += 4; + } + for (uptr addr = beg; addr < (end & ~7UL); addr += 8) { + if (*(u64 *)addr == origin64) + continue; + *(u64 *)addr = origin64; + } + if (end & 7ULL) + if (*(u32 *)(end - kOriginAlign) != origin) + *(u32 *)(end - kOriginAlign) = origin; +} + +#define RET_CHAIN_ORIGIN(id) \ + GET_CALLER_PC_BP_SP; \ + (void)sp; \ + GET_STORE_STACK_TRACE_PC_BP(pc, bp); \ + return ChainOrigin(id, &stack); + +// Return a new origin chain with the previous ID id and the current stack +// trace. +extern "C" SANITIZER_INTERFACE_ATTRIBUTE dfsan_origin +__dfsan_chain_origin(dfsan_origin id) { + RET_CHAIN_ORIGIN(id) +} + +// Return a new origin chain with the previous ID id and the current stack +// trace if the label is tainted. +extern "C" SANITIZER_INTERFACE_ATTRIBUTE dfsan_origin +__dfsan_chain_origin_if_tainted(dfsan_label label, dfsan_origin id) { + if (!label) + return id; + RET_CHAIN_ORIGIN(id) +} + +// Copy or move the origins of the len bytes from src to dst. +extern "C" SANITIZER_INTERFACE_ATTRIBUTE void __dfsan_mem_origin_transfer( + const void *dst, const void *src, uptr len) { + if (src == dst) + return; + GET_CALLER_PC_BP; + GET_STORE_STACK_TRACE_PC_BP(pc, bp); + MoveOrigin(dst, src, len, &stack); +} + +extern "C" SANITIZER_INTERFACE_ATTRIBUTE void dfsan_mem_origin_transfer( + const void *dst, const void *src, uptr len) { + __dfsan_mem_origin_transfer(dst, src, len); +} + +extern "C" SANITIZER_INTERFACE_ATTRIBUTE void dfsan_mem_shadow_transfer( + void *dst, const void *src, uptr len) { + internal_memcpy((void *)__dfsan::shadow_for(dst), + (const void *)__dfsan::shadow_for(src), + len * sizeof(dfsan_label)); +} + +namespace __dfsan { + +bool dfsan_inited = false; +bool dfsan_init_is_running = false; + +void dfsan_copy_memory(void *dst, const void *src, uptr size) { + internal_memcpy(dst, src, size); + dfsan_mem_shadow_transfer(dst, src, size); + if (dfsan_get_track_origins()) + dfsan_mem_origin_transfer(dst, src, size); +} + +// Releases the pages within the origin address range. +static void ReleaseOrigins(void *addr, uptr size) { + const uptr beg_origin_addr = (uptr)__dfsan::origin_for(addr); + const void *end_addr = (void *)((uptr)addr + size); + const uptr end_origin_addr = (uptr)__dfsan::origin_for(end_addr); + + if (end_origin_addr - beg_origin_addr < + common_flags()->clear_shadow_mmap_threshold) + return; + + const uptr page_size = GetPageSizeCached(); + const uptr beg_aligned = RoundUpTo(beg_origin_addr, page_size); + const uptr end_aligned = RoundDownTo(end_origin_addr, page_size); + + if (!MmapFixedSuperNoReserve(beg_aligned, end_aligned - beg_aligned)) + Die(); +} + +static void WriteZeroShadowInRange(uptr beg, uptr end) { + // Don't write the label if it is already the value we need it to be. + // In a program where most addresses are not labeled, it is common that + // a page of shadow memory is entirely zeroed. The Linux copy-on-write + // implementation will share all of the zeroed pages, making a copy of a + // page when any value is written. The un-sharing will happen even if + // the value written does not change the value in memory. Avoiding the + // write when both |label| and |*labelp| are zero dramatically reduces + // the amount of real memory used by large programs. + if (!mem_is_zero((const char *)beg, end - beg)) + internal_memset((void *)beg, 0, end - beg); +} + +// Releases the pages within the shadow address range, and sets +// the shadow addresses not on the pages to be 0. +static void ReleaseOrClearShadows(void *addr, uptr size) { + const uptr beg_shadow_addr = (uptr)__dfsan::shadow_for(addr); + const void *end_addr = (void *)((uptr)addr + size); + const uptr end_shadow_addr = (uptr)__dfsan::shadow_for(end_addr); + + if (end_shadow_addr - beg_shadow_addr < + common_flags()->clear_shadow_mmap_threshold) { + WriteZeroShadowInRange(beg_shadow_addr, end_shadow_addr); + return; + } + + const uptr page_size = GetPageSizeCached(); + const uptr beg_aligned = RoundUpTo(beg_shadow_addr, page_size); + const uptr end_aligned = RoundDownTo(end_shadow_addr, page_size); + + if (beg_aligned >= end_aligned) { + WriteZeroShadowInRange(beg_shadow_addr, end_shadow_addr); + } else { + if (beg_aligned != beg_shadow_addr) + WriteZeroShadowInRange(beg_shadow_addr, beg_aligned); + if (end_aligned != end_shadow_addr) + WriteZeroShadowInRange(end_aligned, end_shadow_addr); + if (!MmapFixedSuperNoReserve(beg_aligned, end_aligned - beg_aligned)) + Die(); + } +} + +void SetShadow(dfsan_label label, void *addr, uptr size, dfsan_origin origin) { + if (0 != label) { + const uptr beg_shadow_addr = (uptr)__dfsan::shadow_for(addr); + internal_memset((void *)beg_shadow_addr, label, size); + if (dfsan_get_track_origins()) + SetOrigin(addr, size, origin); + return; + } + + if (dfsan_get_track_origins()) + ReleaseOrigins(addr, size); + + ReleaseOrClearShadows(addr, size); +} + +} // namespace __dfsan + +// If the label s is tainted, set the size bytes from the address p to be a new +// origin chain with the previous ID o and the current stack trace. This is +// used by instrumentation to reduce code size when too much code is inserted. +extern "C" SANITIZER_INTERFACE_ATTRIBUTE void __dfsan_maybe_store_origin( + dfsan_label s, void *p, uptr size, dfsan_origin o) { + if (UNLIKELY(s)) { + GET_CALLER_PC_BP_SP; + (void)sp; + GET_STORE_STACK_TRACE_PC_BP(pc, bp); + SetOrigin(p, size, ChainOrigin(o, &stack)); + } +} + +extern "C" SANITIZER_INTERFACE_ATTRIBUTE void __dfsan_set_label( + dfsan_label label, dfsan_origin origin, void *addr, uptr size) { + __dfsan::SetShadow(label, addr, size, origin); +} + +SANITIZER_INTERFACE_ATTRIBUTE +void dfsan_set_label(dfsan_label label, void *addr, uptr size) { + dfsan_origin init_origin = 0; + if (label && dfsan_get_track_origins()) { + GET_CALLER_PC_BP; + GET_STORE_STACK_TRACE_PC_BP(pc, bp); + init_origin = ChainOrigin(0, &stack, true); + } + __dfsan::SetShadow(label, addr, size, init_origin); +} + +SANITIZER_INTERFACE_ATTRIBUTE +void dfsan_add_label(dfsan_label label, void *addr, uptr size) { + if (0 == label) + return; + + if (dfsan_get_track_origins()) { + GET_CALLER_PC_BP; + GET_STORE_STACK_TRACE_PC_BP(pc, bp); + dfsan_origin init_origin = ChainOrigin(0, &stack, true); + SetOrigin(addr, size, init_origin); + } + + for (dfsan_label *labelp = shadow_for(addr); size != 0; --size, ++labelp) + *labelp |= label; +} + +// Unlike the other dfsan interface functions the behavior of this function +// depends on the label of one of its arguments. Hence it is implemented as a +// custom function. +extern "C" SANITIZER_INTERFACE_ATTRIBUTE dfsan_label +__dfsw_dfsan_get_label(long data, dfsan_label data_label, + dfsan_label *ret_label) { + *ret_label = 0; + return data_label; +} + +extern "C" SANITIZER_INTERFACE_ATTRIBUTE dfsan_label __dfso_dfsan_get_label( + long data, dfsan_label data_label, dfsan_label *ret_label, + dfsan_origin data_origin, dfsan_origin *ret_origin) { + *ret_label = 0; + *ret_origin = 0; + return data_label; +} + +// This function is used if dfsan_get_origin is called when origin tracking is +// off. +extern "C" SANITIZER_INTERFACE_ATTRIBUTE dfsan_origin __dfsw_dfsan_get_origin( + long data, dfsan_label data_label, dfsan_label *ret_label) { + *ret_label = 0; + return 0; +} + +extern "C" SANITIZER_INTERFACE_ATTRIBUTE dfsan_origin __dfso_dfsan_get_origin( + long data, dfsan_label data_label, dfsan_label *ret_label, + dfsan_origin data_origin, dfsan_origin *ret_origin) { + *ret_label = 0; + *ret_origin = 0; + return data_origin; +} + +SANITIZER_INTERFACE_ATTRIBUTE dfsan_label +dfsan_read_label(const void *addr, uptr size) { + if (size == 0) + return 0; + return __dfsan_union_load(shadow_for(addr), size); +} + +SANITIZER_INTERFACE_ATTRIBUTE dfsan_origin +dfsan_read_origin_of_first_taint(const void *addr, uptr size) { + return GetOriginIfTainted((uptr)addr, size); +} + +SANITIZER_INTERFACE_ATTRIBUTE void dfsan_set_label_origin(dfsan_label label, + dfsan_origin origin, + void *addr, + uptr size) { + __dfsan_set_label(label, origin, addr, size); +} + +extern "C" SANITIZER_INTERFACE_ATTRIBUTE int +dfsan_has_label(dfsan_label label, dfsan_label elem) { + return (label & elem) == elem; +} + +namespace __dfsan { + +typedef void (*dfsan_conditional_callback_t)(dfsan_label label, + dfsan_origin origin); +static dfsan_conditional_callback_t conditional_callback = nullptr; +static dfsan_label labels_in_signal_conditional = 0; + +static void ConditionalCallback(dfsan_label label, dfsan_origin origin) { + // Programs have many branches. For efficiency the conditional sink callback + // handler needs to ignore as many as possible as early as possible. + if (label == 0) { + return; + } + if (conditional_callback == nullptr) { + return; + } + + // This initial ConditionalCallback handler needs to be in here in dfsan + // runtime (rather than being an entirely user implemented hook) so that it + // has access to dfsan thread information. + DFsanThread *t = GetCurrentThread(); + // A callback operation which does useful work (like record the flow) will + // likely be too long executed in a signal handler. + if (t && t->InSignalHandler()) { + // Record set of labels used in signal handler for completeness. + labels_in_signal_conditional |= label; + return; + } + + conditional_callback(label, origin); +} + +} // namespace __dfsan + +extern "C" SANITIZER_INTERFACE_ATTRIBUTE void +__dfsan_conditional_callback_origin(dfsan_label label, dfsan_origin origin) { + __dfsan::ConditionalCallback(label, origin); +} + +extern "C" SANITIZER_INTERFACE_ATTRIBUTE void __dfsan_conditional_callback( + dfsan_label label) { + __dfsan::ConditionalCallback(label, 0); +} + +extern "C" SANITIZER_INTERFACE_ATTRIBUTE void dfsan_set_conditional_callback( + __dfsan::dfsan_conditional_callback_t callback) { + __dfsan::conditional_callback = callback; +} + +extern "C" SANITIZER_INTERFACE_ATTRIBUTE dfsan_label +dfsan_get_labels_in_signal_conditional() { + return __dfsan::labels_in_signal_conditional; +} + +class Decorator : public __sanitizer::SanitizerCommonDecorator { + public: + Decorator() : SanitizerCommonDecorator() {} + const char *Origin() const { return Magenta(); } +}; + +namespace { + +void PrintNoOriginTrackingWarning() { + Decorator d; + Printf( + " %sDFSan: origin tracking is not enabled. Did you specify the " + "-dfsan-track-origins=1 option?%s\n", + d.Warning(), d.Default()); +} + +void PrintNoTaintWarning(const void *address) { + Decorator d; + Printf(" %sDFSan: no tainted value at %x%s\n", d.Warning(), address, + d.Default()); +} + +void PrintInvalidOriginWarning(dfsan_label label, const void *address) { + Decorator d; + Printf( + " %sTaint value 0x%x (at %p) has invalid origin tracking. This can " + "be a DFSan bug.%s\n", + d.Warning(), label, address, d.Default()); +} + +void PrintInvalidOriginIdWarning(dfsan_origin origin) { + Decorator d; + Printf( + " %sOrigin Id %d has invalid origin tracking. This can " + "be a DFSan bug.%s\n", + d.Warning(), origin, d.Default()); +} + +bool PrintOriginTraceFramesToStr(Origin o, InternalScopedString *out) { + Decorator d; + bool found = false; + + while (o.isChainedOrigin()) { + StackTrace stack; + dfsan_origin origin_id = o.raw_id(); + o = o.getNextChainedOrigin(&stack); + if (o.isChainedOrigin()) + out->append( + " %sOrigin value: 0x%x, Taint value was stored to memory at%s\n", + d.Origin(), origin_id, d.Default()); + else + out->append(" %sOrigin value: 0x%x, Taint value was created at%s\n", + d.Origin(), origin_id, d.Default()); + + // Includes a trailing newline, so no need to add it again. + stack.PrintTo(out); + found = true; + } + + return found; +} + +bool PrintOriginTraceToStr(const void *addr, const char *description, + InternalScopedString *out) { + CHECK(out); + CHECK(dfsan_get_track_origins()); + Decorator d; + + const dfsan_label label = *__dfsan::shadow_for(addr); + CHECK(label); + + const dfsan_origin origin = *__dfsan::origin_for(addr); + + out->append(" %sTaint value 0x%x (at %p) origin tracking (%s)%s\n", + d.Origin(), label, addr, description ? description : "", + d.Default()); + + Origin o = Origin::FromRawId(origin); + return PrintOriginTraceFramesToStr(o, out); +} + +} // namespace + +extern "C" SANITIZER_INTERFACE_ATTRIBUTE void dfsan_print_origin_trace( + const void *addr, const char *description) { + if (!dfsan_get_track_origins()) { + PrintNoOriginTrackingWarning(); + return; + } + + const dfsan_label label = *__dfsan::shadow_for(addr); + if (!label) { + PrintNoTaintWarning(addr); + return; + } + + InternalScopedString trace; + bool success = PrintOriginTraceToStr(addr, description, &trace); + + if (trace.length()) + Printf("%s", trace.data()); + + if (!success) + PrintInvalidOriginWarning(label, addr); +} + +extern "C" SANITIZER_INTERFACE_ATTRIBUTE uptr +dfsan_sprint_origin_trace(const void *addr, const char *description, + char *out_buf, uptr out_buf_size) { + CHECK(out_buf); + + if (!dfsan_get_track_origins()) { + PrintNoOriginTrackingWarning(); + return 0; + } + + const dfsan_label label = *__dfsan::shadow_for(addr); + if (!label) { + PrintNoTaintWarning(addr); + return 0; + } + + InternalScopedString trace; + bool success = PrintOriginTraceToStr(addr, description, &trace); + + if (!success) { + PrintInvalidOriginWarning(label, addr); + return 0; + } + + if (out_buf_size) { + internal_strncpy(out_buf, trace.data(), out_buf_size - 1); + out_buf[out_buf_size - 1] = '\0'; + } + + return trace.length(); +} + +extern "C" SANITIZER_INTERFACE_ATTRIBUTE void dfsan_print_origin_id_trace( + dfsan_origin origin) { + if (!dfsan_get_track_origins()) { + PrintNoOriginTrackingWarning(); + return; + } + Origin o = Origin::FromRawId(origin); + + InternalScopedString trace; + bool success = PrintOriginTraceFramesToStr(o, &trace); + + if (trace.length()) + Printf("%s", trace.data()); + + if (!success) + PrintInvalidOriginIdWarning(origin); +} + +extern "C" SANITIZER_INTERFACE_ATTRIBUTE uptr dfsan_sprint_origin_id_trace( + dfsan_origin origin, char *out_buf, uptr out_buf_size) { + CHECK(out_buf); + + if (!dfsan_get_track_origins()) { + PrintNoOriginTrackingWarning(); + return 0; + } + Origin o = Origin::FromRawId(origin); + + InternalScopedString trace; + bool success = PrintOriginTraceFramesToStr(o, &trace); + + if (!success) { + PrintInvalidOriginIdWarning(origin); + return 0; + } + + if (out_buf_size) { + internal_strncpy(out_buf, trace.data(), out_buf_size - 1); + out_buf[out_buf_size - 1] = '\0'; + } + + return trace.length(); +} + +extern "C" SANITIZER_INTERFACE_ATTRIBUTE dfsan_origin +dfsan_get_init_origin(const void *addr) { + if (!dfsan_get_track_origins()) + return 0; + + const dfsan_label label = *__dfsan::shadow_for(addr); + if (!label) + return 0; + + const dfsan_origin origin = *__dfsan::origin_for(addr); + + Origin o = Origin::FromRawId(origin); + dfsan_origin origin_id = o.raw_id(); + while (o.isChainedOrigin()) { + StackTrace stack; + origin_id = o.raw_id(); + o = o.getNextChainedOrigin(&stack); + } + return origin_id; +} + +void __sanitizer::BufferedStackTrace::UnwindImpl(uptr pc, uptr bp, + void *context, + bool request_fast, + u32 max_depth) { + using namespace __dfsan; + DFsanThread *t = GetCurrentThread(); + if (!t || !StackTrace::WillUseFastUnwind(request_fast)) { + return Unwind(max_depth, pc, bp, context, 0, 0, false); + } + Unwind(max_depth, pc, bp, nullptr, t->stack_top(), t->stack_bottom(), true); +} + +extern "C" SANITIZER_INTERFACE_ATTRIBUTE void __sanitizer_print_stack_trace() { + GET_CALLER_PC_BP; + GET_STORE_STACK_TRACE_PC_BP(pc, bp); + stack.Print(); +} + +extern "C" SANITIZER_INTERFACE_ATTRIBUTE uptr +dfsan_sprint_stack_trace(char *out_buf, uptr out_buf_size) { + CHECK(out_buf); + GET_CALLER_PC_BP; + GET_STORE_STACK_TRACE_PC_BP(pc, bp); + return stack.PrintTo(out_buf, out_buf_size); +} + +void Flags::SetDefaults() { +#define DFSAN_FLAG(Type, Name, DefaultValue, Description) Name = DefaultValue; +#include "dfsan_flags.inc" +#undef DFSAN_FLAG +} + +static void RegisterDfsanFlags(FlagParser *parser, Flags *f) { +#define DFSAN_FLAG(Type, Name, DefaultValue, Description) \ + RegisterFlag(parser, #Name, Description, &f->Name); +#include "dfsan_flags.inc" +#undef DFSAN_FLAG +} + +static void InitializeFlags() { + SetCommonFlagsDefaults(); + { + CommonFlags cf; + cf.CopyFrom(*common_flags()); + cf.intercept_tls_get_addr = true; + OverrideCommonFlags(cf); + } + flags().SetDefaults(); + + FlagParser parser; + RegisterCommonFlags(&parser); + RegisterDfsanFlags(&parser, &flags()); + parser.ParseStringFromEnv("DFSAN_OPTIONS"); + InitializeCommonFlags(); + if (Verbosity()) ReportUnrecognizedFlags(); + if (common_flags()->help) parser.PrintFlagDescriptions(); +} + +SANITIZER_INTERFACE_ATTRIBUTE +void dfsan_clear_arg_tls(uptr offset, uptr size) { + internal_memset((void *)((uptr)__dfsan_arg_tls + offset), 0, size); +} + +SANITIZER_INTERFACE_ATTRIBUTE +void dfsan_clear_thread_local_state() { + internal_memset(__dfsan_arg_tls, 0, sizeof(__dfsan_arg_tls)); + internal_memset(__dfsan_retval_tls, 0, sizeof(__dfsan_retval_tls)); + + if (dfsan_get_track_origins()) { + internal_memset(__dfsan_arg_origin_tls, 0, sizeof(__dfsan_arg_origin_tls)); + internal_memset(&__dfsan_retval_origin_tls, 0, + sizeof(__dfsan_retval_origin_tls)); + } +} + +extern "C" void dfsan_flush() { + const uptr maxVirtualAddress = GetMaxUserVirtualAddress(); + for (unsigned i = 0; i < kMemoryLayoutSize; ++i) { + uptr start = kMemoryLayout[i].start; + uptr end = kMemoryLayout[i].end; + uptr size = end - start; + MappingDesc::Type type = kMemoryLayout[i].type; + + if (type != MappingDesc::SHADOW && type != MappingDesc::ORIGIN) + continue; + + // Check if the segment should be mapped based on platform constraints. + if (start >= maxVirtualAddress) + continue; + + if (!MmapFixedSuperNoReserve(start, size, kMemoryLayout[i].name)) { + Printf("FATAL: DataFlowSanitizer: failed to clear memory region\n"); + Die(); + } + } + __dfsan::labels_in_signal_conditional = 0; +} + +// TODO: CheckMemoryLayoutSanity is based on msan. +// Consider refactoring these into a shared implementation. +static void CheckMemoryLayoutSanity() { + uptr prev_end = 0; + for (unsigned i = 0; i < kMemoryLayoutSize; ++i) { + uptr start = kMemoryLayout[i].start; + uptr end = kMemoryLayout[i].end; + MappingDesc::Type type = kMemoryLayout[i].type; + CHECK_LT(start, end); + CHECK_EQ(prev_end, start); + CHECK(addr_is_type(start, type)); + CHECK(addr_is_type((start + end) / 2, type)); + CHECK(addr_is_type(end - 1, type)); + if (type == MappingDesc::APP) { + uptr addr = start; + CHECK(MEM_IS_SHADOW(MEM_TO_SHADOW(addr))); + CHECK(MEM_IS_ORIGIN(MEM_TO_ORIGIN(addr))); + CHECK_EQ(MEM_TO_ORIGIN(addr), SHADOW_TO_ORIGIN(MEM_TO_SHADOW(addr))); + + addr = (start + end) / 2; + CHECK(MEM_IS_SHADOW(MEM_TO_SHADOW(addr))); + CHECK(MEM_IS_ORIGIN(MEM_TO_ORIGIN(addr))); + CHECK_EQ(MEM_TO_ORIGIN(addr), SHADOW_TO_ORIGIN(MEM_TO_SHADOW(addr))); + + addr = end - 1; + CHECK(MEM_IS_SHADOW(MEM_TO_SHADOW(addr))); + CHECK(MEM_IS_ORIGIN(MEM_TO_ORIGIN(addr))); + CHECK_EQ(MEM_TO_ORIGIN(addr), SHADOW_TO_ORIGIN(MEM_TO_SHADOW(addr))); + } + prev_end = end; + } +} + +// TODO: CheckMemoryRangeAvailability is based on msan. +// Consider refactoring these into a shared implementation. +static bool CheckMemoryRangeAvailability(uptr beg, uptr size) { + if (size > 0) { + uptr end = beg + size - 1; + if (!MemoryRangeIsAvailable(beg, end)) { + Printf("FATAL: Memory range %p - %p is not available.\n", beg, end); + return false; + } + } + return true; +} + +// TODO: ProtectMemoryRange is based on msan. +// Consider refactoring these into a shared implementation. +static bool ProtectMemoryRange(uptr beg, uptr size, const char *name) { + if (size > 0) { + void *addr = MmapFixedNoAccess(beg, size, name); + if (beg == 0 && addr) { + // Depending on the kernel configuration, we may not be able to protect + // the page at address zero. + uptr gap = 16 * GetPageSizeCached(); + beg += gap; + size -= gap; + addr = MmapFixedNoAccess(beg, size, name); + } + if ((uptr)addr != beg) { + uptr end = beg + size - 1; + Printf("FATAL: Cannot protect memory range %p - %p (%s).\n", beg, end, + name); + return false; + } + } + return true; +} + +// TODO: InitShadow is based on msan. +// Consider refactoring these into a shared implementation. +bool InitShadow(bool init_origins) { + // Let user know mapping parameters first. + VPrintf(1, "dfsan_init %p\n", (void *)&__dfsan::dfsan_init); + for (unsigned i = 0; i < kMemoryLayoutSize; ++i) + VPrintf(1, "%s: %zx - %zx\n", kMemoryLayout[i].name, kMemoryLayout[i].start, + kMemoryLayout[i].end - 1); + + CheckMemoryLayoutSanity(); + + if (!MEM_IS_APP(&__dfsan::dfsan_init)) { + Printf("FATAL: Code %p is out of application range. Non-PIE build?\n", + (uptr)&__dfsan::dfsan_init); + return false; + } + + const uptr maxVirtualAddress = GetMaxUserVirtualAddress(); + + for (unsigned i = 0; i < kMemoryLayoutSize; ++i) { + uptr start = kMemoryLayout[i].start; + uptr end = kMemoryLayout[i].end; + uptr size = end - start; + MappingDesc::Type type = kMemoryLayout[i].type; + + // Check if the segment should be mapped based on platform constraints. + if (start >= maxVirtualAddress) + continue; + + bool map = type == MappingDesc::SHADOW || + (init_origins && type == MappingDesc::ORIGIN); + bool protect = type == MappingDesc::INVALID || + (!init_origins && type == MappingDesc::ORIGIN); + CHECK(!(map && protect)); + if (!map && !protect) + CHECK(type == MappingDesc::APP); + if (map) { + if (!CheckMemoryRangeAvailability(start, size)) + return false; + if (!MmapFixedSuperNoReserve(start, size, kMemoryLayout[i].name)) + return false; + if (common_flags()->use_madv_dontdump) + DontDumpShadowMemory(start, size); + } + if (protect) { + if (!CheckMemoryRangeAvailability(start, size)) + return false; + if (!ProtectMemoryRange(start, size, kMemoryLayout[i].name)) + return false; + } + } + + return true; +} + +static void DFsanInit(int argc, char **argv, char **envp) { + CHECK(!dfsan_init_is_running); + if (dfsan_inited) + return; + dfsan_init_is_running = true; + SanitizerToolName = "DataflowSanitizer"; + + AvoidCVE_2016_2143(); + + InitializeFlags(); + + CheckASLR(); + + InitShadow(dfsan_get_track_origins()); + + initialize_interceptors(); + + // Set up threads + DFsanTSDInit(DFsanTSDDtor); + + dfsan_allocator_init(); + + DFsanThread *main_thread = DFsanThread::Create(nullptr, nullptr, nullptr); + SetCurrentThread(main_thread); + main_thread->Init(); + + dfsan_init_is_running = false; + dfsan_inited = true; +} + +namespace __dfsan { + +void dfsan_init() { DFsanInit(0, nullptr, nullptr); } + +} // namespace __dfsan + +#if SANITIZER_CAN_USE_PREINIT_ARRAY +__attribute__((section(".preinit_array"), + used)) static void (*dfsan_init_ptr)(int, char **, + char **) = DFsanInit; +#endif diff --git a/contrib/libs/clang14-rt/lib/dfsan/dfsan.h b/contrib/libs/clang14-rt/lib/dfsan/dfsan.h new file mode 100644 index 0000000000..f3403dd44f --- /dev/null +++ b/contrib/libs/clang14-rt/lib/dfsan/dfsan.h @@ -0,0 +1,106 @@ +//===-- dfsan.h -------------------------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file is a part of DataFlowSanitizer. +// +// Private DFSan header. +//===----------------------------------------------------------------------===// + +#ifndef DFSAN_H +#define DFSAN_H + +#include "sanitizer_common/sanitizer_internal_defs.h" + +#include "dfsan_platform.h" + +using __sanitizer::u32; +using __sanitizer::u8; +using __sanitizer::uptr; + +// Copy declarations from public sanitizer/dfsan_interface.h header here. +typedef u8 dfsan_label; +typedef u32 dfsan_origin; + +extern "C" { +void dfsan_add_label(dfsan_label label, void *addr, uptr size); +void dfsan_set_label(dfsan_label label, void *addr, uptr size); +dfsan_label dfsan_read_label(const void *addr, uptr size); +dfsan_label dfsan_union(dfsan_label l1, dfsan_label l2); +// Zero out [offset, offset+size) from __dfsan_arg_tls. +void dfsan_clear_arg_tls(uptr offset, uptr size); +// Zero out the TLS storage. +void dfsan_clear_thread_local_state(); + +// Return the origin associated with the first taint byte in the size bytes +// from the address addr. +dfsan_origin dfsan_read_origin_of_first_taint(const void *addr, uptr size); + +// Set the data within [addr, addr+size) with label and origin. +void dfsan_set_label_origin(dfsan_label label, dfsan_origin origin, void *addr, + uptr size); + +// Copy or move the origins of the len bytes from src to dst. +void dfsan_mem_origin_transfer(const void *dst, const void *src, uptr len); + +// Copy shadow bytes from src to dst. +// Note this preserves distinct taint labels at specific offsets. +void dfsan_mem_shadow_transfer(void *dst, const void *src, uptr len); +} // extern "C" + +template <typename T> +void dfsan_set_label(dfsan_label label, T &data) { + dfsan_set_label(label, (void *)&data, sizeof(T)); +} + +namespace __dfsan { + +extern bool dfsan_inited; +extern bool dfsan_init_is_running; + +void initialize_interceptors(); + +inline dfsan_label *shadow_for(void *ptr) { + return (dfsan_label *)MEM_TO_SHADOW(ptr); +} + +inline const dfsan_label *shadow_for(const void *ptr) { + return shadow_for(const_cast<void *>(ptr)); +} + +inline uptr unaligned_origin_for(uptr ptr) { return MEM_TO_ORIGIN(ptr); } + +inline dfsan_origin *origin_for(void *ptr) { + auto aligned_addr = unaligned_origin_for(reinterpret_cast<uptr>(ptr)) & + ~(sizeof(dfsan_origin) - 1); + return reinterpret_cast<dfsan_origin *>(aligned_addr); +} + +inline const dfsan_origin *origin_for(const void *ptr) { + return origin_for(const_cast<void *>(ptr)); +} + +void dfsan_copy_memory(void *dst, const void *src, uptr size); + +void dfsan_allocator_init(); +void dfsan_deallocate(void *ptr); + +void *dfsan_malloc(uptr size); +void *dfsan_calloc(uptr nmemb, uptr size); +void *dfsan_realloc(void *ptr, uptr size); +void *dfsan_reallocarray(void *ptr, uptr nmemb, uptr size); +void *dfsan_valloc(uptr size); +void *dfsan_pvalloc(uptr size); +void *dfsan_aligned_alloc(uptr alignment, uptr size); +void *dfsan_memalign(uptr alignment, uptr size); +int dfsan_posix_memalign(void **memptr, uptr alignment, uptr size); + +void dfsan_init(); + +} // namespace __dfsan + +#endif // DFSAN_H diff --git a/contrib/libs/clang14-rt/lib/dfsan/dfsan_allocator.cpp b/contrib/libs/clang14-rt/lib/dfsan/dfsan_allocator.cpp new file mode 100644 index 0000000000..c50aee7a55 --- /dev/null +++ b/contrib/libs/clang14-rt/lib/dfsan/dfsan_allocator.cpp @@ -0,0 +1,293 @@ +//===-- dfsan_allocator.cpp -------------------------- --------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file is a part of DataflowSanitizer. +// +// DataflowSanitizer allocator. +//===----------------------------------------------------------------------===// + +#include "dfsan_allocator.h" + +#include "dfsan.h" +#include "dfsan_flags.h" +#include "dfsan_thread.h" +#include "sanitizer_common/sanitizer_allocator.h" +#include "sanitizer_common/sanitizer_allocator_checks.h" +#include "sanitizer_common/sanitizer_allocator_interface.h" +#include "sanitizer_common/sanitizer_allocator_report.h" +#include "sanitizer_common/sanitizer_errno.h" + +namespace __dfsan { + +struct Metadata { + uptr requested_size; +}; + +struct DFsanMapUnmapCallback { + void OnMap(uptr p, uptr size) const { dfsan_set_label(0, (void *)p, size); } + void OnUnmap(uptr p, uptr size) const { dfsan_set_label(0, (void *)p, size); } +}; + +static const uptr kAllocatorSpace = 0x700000000000ULL; +static const uptr kMaxAllowedMallocSize = 8UL << 30; + +struct AP64 { // Allocator64 parameters. Deliberately using a short name. + static const uptr kSpaceBeg = kAllocatorSpace; + static const uptr kSpaceSize = 0x40000000000; // 4T. + static const uptr kMetadataSize = sizeof(Metadata); + typedef DefaultSizeClassMap SizeClassMap; + typedef DFsanMapUnmapCallback MapUnmapCallback; + static const uptr kFlags = 0; + using AddressSpaceView = LocalAddressSpaceView; +}; + +typedef SizeClassAllocator64<AP64> PrimaryAllocator; + +typedef CombinedAllocator<PrimaryAllocator> Allocator; +typedef Allocator::AllocatorCache AllocatorCache; + +static Allocator allocator; +static AllocatorCache fallback_allocator_cache; +static StaticSpinMutex fallback_mutex; + +static uptr max_malloc_size; + +void dfsan_allocator_init() { + SetAllocatorMayReturnNull(common_flags()->allocator_may_return_null); + allocator.Init(common_flags()->allocator_release_to_os_interval_ms); + if (common_flags()->max_allocation_size_mb) + max_malloc_size = Min(common_flags()->max_allocation_size_mb << 20, + kMaxAllowedMallocSize); + else + max_malloc_size = kMaxAllowedMallocSize; +} + +AllocatorCache *GetAllocatorCache(DFsanThreadLocalMallocStorage *ms) { + CHECK(ms); + CHECK_LE(sizeof(AllocatorCache), sizeof(ms->allocator_cache)); + return reinterpret_cast<AllocatorCache *>(ms->allocator_cache); +} + +void DFsanThreadLocalMallocStorage::CommitBack() { + allocator.SwallowCache(GetAllocatorCache(this)); +} + +static void *DFsanAllocate(uptr size, uptr alignment, bool zeroise) { + if (size > max_malloc_size) { + if (AllocatorMayReturnNull()) { + Report("WARNING: DataflowSanitizer failed to allocate 0x%zx bytes\n", + size); + return nullptr; + } + BufferedStackTrace stack; + ReportAllocationSizeTooBig(size, max_malloc_size, &stack); + } + if (UNLIKELY(IsRssLimitExceeded())) { + if (AllocatorMayReturnNull()) + return nullptr; + BufferedStackTrace stack; + ReportRssLimitExceeded(&stack); + } + DFsanThread *t = GetCurrentThread(); + void *allocated; + if (t) { + AllocatorCache *cache = GetAllocatorCache(&t->malloc_storage()); + allocated = allocator.Allocate(cache, size, alignment); + } else { + SpinMutexLock l(&fallback_mutex); + AllocatorCache *cache = &fallback_allocator_cache; + allocated = allocator.Allocate(cache, size, alignment); + } + if (UNLIKELY(!allocated)) { + SetAllocatorOutOfMemory(); + if (AllocatorMayReturnNull()) + return nullptr; + BufferedStackTrace stack; + ReportOutOfMemory(size, &stack); + } + Metadata *meta = + reinterpret_cast<Metadata *>(allocator.GetMetaData(allocated)); + meta->requested_size = size; + if (zeroise) { + internal_memset(allocated, 0, size); + dfsan_set_label(0, allocated, size); + } else if (flags().zero_in_malloc) { + dfsan_set_label(0, allocated, size); + } + return allocated; +} + +void dfsan_deallocate(void *p) { + CHECK(p); + Metadata *meta = reinterpret_cast<Metadata *>(allocator.GetMetaData(p)); + uptr size = meta->requested_size; + meta->requested_size = 0; + if (flags().zero_in_free) + dfsan_set_label(0, p, size); + DFsanThread *t = GetCurrentThread(); + if (t) { + AllocatorCache *cache = GetAllocatorCache(&t->malloc_storage()); + allocator.Deallocate(cache, p); + } else { + SpinMutexLock l(&fallback_mutex); + AllocatorCache *cache = &fallback_allocator_cache; + allocator.Deallocate(cache, p); + } +} + +void *DFsanReallocate(void *old_p, uptr new_size, uptr alignment) { + Metadata *meta = reinterpret_cast<Metadata *>(allocator.GetMetaData(old_p)); + uptr old_size = meta->requested_size; + uptr actually_allocated_size = allocator.GetActuallyAllocatedSize(old_p); + if (new_size <= actually_allocated_size) { + // We are not reallocating here. + meta->requested_size = new_size; + if (new_size > old_size && flags().zero_in_malloc) + dfsan_set_label(0, (char *)old_p + old_size, new_size - old_size); + return old_p; + } + uptr memcpy_size = Min(new_size, old_size); + void *new_p = DFsanAllocate(new_size, alignment, false /*zeroise*/); + if (new_p) { + dfsan_copy_memory(new_p, old_p, memcpy_size); + dfsan_deallocate(old_p); + } + return new_p; +} + +void *DFsanCalloc(uptr nmemb, uptr size) { + if (UNLIKELY(CheckForCallocOverflow(size, nmemb))) { + if (AllocatorMayReturnNull()) + return nullptr; + BufferedStackTrace stack; + ReportCallocOverflow(nmemb, size, &stack); + } + return DFsanAllocate(nmemb * size, sizeof(u64), true /*zeroise*/); +} + +static uptr AllocationSize(const void *p) { + if (!p) + return 0; + const void *beg = allocator.GetBlockBegin(p); + if (beg != p) + return 0; + Metadata *b = (Metadata *)allocator.GetMetaData(p); + return b->requested_size; +} + +void *dfsan_malloc(uptr size) { + return SetErrnoOnNull(DFsanAllocate(size, sizeof(u64), false /*zeroise*/)); +} + +void *dfsan_calloc(uptr nmemb, uptr size) { + return SetErrnoOnNull(DFsanCalloc(nmemb, size)); +} + +void *dfsan_realloc(void *ptr, uptr size) { + if (!ptr) + return SetErrnoOnNull(DFsanAllocate(size, sizeof(u64), false /*zeroise*/)); + if (size == 0) { + dfsan_deallocate(ptr); + return nullptr; + } + return SetErrnoOnNull(DFsanReallocate(ptr, size, sizeof(u64))); +} + +void *dfsan_reallocarray(void *ptr, uptr nmemb, uptr size) { + if (UNLIKELY(CheckForCallocOverflow(size, nmemb))) { + errno = errno_ENOMEM; + if (AllocatorMayReturnNull()) + return nullptr; + BufferedStackTrace stack; + ReportReallocArrayOverflow(nmemb, size, &stack); + } + return dfsan_realloc(ptr, nmemb * size); +} + +void *dfsan_valloc(uptr size) { + return SetErrnoOnNull( + DFsanAllocate(size, GetPageSizeCached(), false /*zeroise*/)); +} + +void *dfsan_pvalloc(uptr size) { + uptr PageSize = GetPageSizeCached(); + if (UNLIKELY(CheckForPvallocOverflow(size, PageSize))) { + errno = errno_ENOMEM; + if (AllocatorMayReturnNull()) + return nullptr; + BufferedStackTrace stack; + ReportPvallocOverflow(size, &stack); + } + // pvalloc(0) should allocate one page. + size = size ? RoundUpTo(size, PageSize) : PageSize; + return SetErrnoOnNull(DFsanAllocate(size, PageSize, false /*zeroise*/)); +} + +void *dfsan_aligned_alloc(uptr alignment, uptr size) { + if (UNLIKELY(!CheckAlignedAllocAlignmentAndSize(alignment, size))) { + errno = errno_EINVAL; + if (AllocatorMayReturnNull()) + return nullptr; + BufferedStackTrace stack; + ReportInvalidAlignedAllocAlignment(size, alignment, &stack); + } + return SetErrnoOnNull(DFsanAllocate(size, alignment, false /*zeroise*/)); +} + +void *dfsan_memalign(uptr alignment, uptr size) { + if (UNLIKELY(!IsPowerOfTwo(alignment))) { + errno = errno_EINVAL; + if (AllocatorMayReturnNull()) + return nullptr; + BufferedStackTrace stack; + ReportInvalidAllocationAlignment(alignment, &stack); + } + return SetErrnoOnNull(DFsanAllocate(size, alignment, false /*zeroise*/)); +} + +int dfsan_posix_memalign(void **memptr, uptr alignment, uptr size) { + if (UNLIKELY(!CheckPosixMemalignAlignment(alignment))) { + if (AllocatorMayReturnNull()) + return errno_EINVAL; + BufferedStackTrace stack; + ReportInvalidPosixMemalignAlignment(alignment, &stack); + } + void *ptr = DFsanAllocate(size, alignment, false /*zeroise*/); + if (UNLIKELY(!ptr)) + // OOM error is already taken care of by DFsanAllocate. + return errno_ENOMEM; + CHECK(IsAligned((uptr)ptr, alignment)); + *memptr = ptr; + return 0; +} + +} // namespace __dfsan + +using namespace __dfsan; + +uptr __sanitizer_get_current_allocated_bytes() { + uptr stats[AllocatorStatCount]; + allocator.GetStats(stats); + return stats[AllocatorStatAllocated]; +} + +uptr __sanitizer_get_heap_size() { + uptr stats[AllocatorStatCount]; + allocator.GetStats(stats); + return stats[AllocatorStatMapped]; +} + +uptr __sanitizer_get_free_bytes() { return 1; } + +uptr __sanitizer_get_unmapped_bytes() { return 1; } + +uptr __sanitizer_get_estimated_allocated_size(uptr size) { return size; } + +int __sanitizer_get_ownership(const void *p) { return AllocationSize(p) != 0; } + +uptr __sanitizer_get_allocated_size(const void *p) { return AllocationSize(p); } diff --git a/contrib/libs/clang14-rt/lib/dfsan/dfsan_allocator.h b/contrib/libs/clang14-rt/lib/dfsan/dfsan_allocator.h new file mode 100644 index 0000000000..3b4171b631 --- /dev/null +++ b/contrib/libs/clang14-rt/lib/dfsan/dfsan_allocator.h @@ -0,0 +1,30 @@ +//===-- dfsan_allocator.h ---------------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file is a part of DataflowSanitizer. +// +//===----------------------------------------------------------------------===// + +#ifndef DFSAN_ALLOCATOR_H +#define DFSAN_ALLOCATOR_H + +#include "sanitizer_common/sanitizer_common.h" + +namespace __dfsan { + +struct DFsanThreadLocalMallocStorage { + ALIGNED(8) uptr allocator_cache[96 * (512 * 8 + 16)]; // Opaque. + void CommitBack(); + + private: + // These objects are allocated via mmap() and are zero-initialized. + DFsanThreadLocalMallocStorage() {} +}; + +} // namespace __dfsan +#endif // DFSAN_ALLOCATOR_H diff --git a/contrib/libs/clang14-rt/lib/dfsan/dfsan_chained_origin_depot.cpp b/contrib/libs/clang14-rt/lib/dfsan/dfsan_chained_origin_depot.cpp new file mode 100644 index 0000000000..9ec598bf2c --- /dev/null +++ b/contrib/libs/clang14-rt/lib/dfsan/dfsan_chained_origin_depot.cpp @@ -0,0 +1,22 @@ +//===-- dfsan_chained_origin_depot.cpp ------------------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file is a part of DataFlowSanitizer. +// +// A storage for chained origins. +//===----------------------------------------------------------------------===// + +#include "dfsan_chained_origin_depot.h" + +namespace __dfsan { + +static ChainedOriginDepot chainedOriginDepot; + +ChainedOriginDepot* GetChainedOriginDepot() { return &chainedOriginDepot; } + +} // namespace __dfsan diff --git a/contrib/libs/clang14-rt/lib/dfsan/dfsan_chained_origin_depot.h b/contrib/libs/clang14-rt/lib/dfsan/dfsan_chained_origin_depot.h new file mode 100644 index 0000000000..d715ef707f --- /dev/null +++ b/contrib/libs/clang14-rt/lib/dfsan/dfsan_chained_origin_depot.h @@ -0,0 +1,26 @@ +//===-- dfsan_chained_origin_depot.h ----------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file is a part of DataFlowSanitizer. +// +// A storage for chained origins. +//===----------------------------------------------------------------------===// + +#ifndef DFSAN_CHAINED_ORIGIN_DEPOT_H +#define DFSAN_CHAINED_ORIGIN_DEPOT_H + +#include "sanitizer_common/sanitizer_chained_origin_depot.h" +#include "sanitizer_common/sanitizer_common.h" + +namespace __dfsan { + +ChainedOriginDepot* GetChainedOriginDepot(); + +} // namespace __dfsan + +#endif // DFSAN_CHAINED_ORIGIN_DEPOT_H diff --git a/contrib/libs/clang14-rt/lib/dfsan/dfsan_custom.cpp b/contrib/libs/clang14-rt/lib/dfsan/dfsan_custom.cpp new file mode 100644 index 0000000000..1965d0ddff --- /dev/null +++ b/contrib/libs/clang14-rt/lib/dfsan/dfsan_custom.cpp @@ -0,0 +1,2529 @@ +//===-- dfsan_custom.cpp --------------------------------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file is a part of DataFlowSanitizer. +// +// This file defines the custom functions listed in done_abilist.txt. +//===----------------------------------------------------------------------===// + +#include <arpa/inet.h> +#include <assert.h> +#include <ctype.h> +#include <dlfcn.h> +#include <link.h> +#include <poll.h> +#include <pthread.h> +#include <pwd.h> +#include <sched.h> +#include <signal.h> +#include <stdarg.h> +#include <stdint.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <sys/epoll.h> +#include <sys/resource.h> +#include <sys/select.h> +#include <sys/socket.h> +#include <sys/stat.h> +#include <sys/time.h> +#include <sys/types.h> +#include <time.h> +#include <unistd.h> + +#include "dfsan/dfsan.h" +#include "dfsan/dfsan_chained_origin_depot.h" +#include "dfsan/dfsan_flags.h" +#include "dfsan/dfsan_thread.h" +#include "sanitizer_common/sanitizer_common.h" +#include "sanitizer_common/sanitizer_internal_defs.h" +#include "sanitizer_common/sanitizer_linux.h" +#include "sanitizer_common/sanitizer_stackdepot.h" + +using namespace __dfsan; + +#define CALL_WEAK_INTERCEPTOR_HOOK(f, ...) \ + do { \ + if (f) \ + f(__VA_ARGS__); \ + } while (false) +#define DECLARE_WEAK_INTERCEPTOR_HOOK(f, ...) \ +SANITIZER_INTERFACE_ATTRIBUTE SANITIZER_WEAK_ATTRIBUTE void f(__VA_ARGS__); + +// Async-safe, non-reentrant spin lock. +class SignalSpinLocker { + public: + SignalSpinLocker() { + sigset_t all_set; + sigfillset(&all_set); + pthread_sigmask(SIG_SETMASK, &all_set, &saved_thread_mask_); + sigactions_mu.Lock(); + } + ~SignalSpinLocker() { + sigactions_mu.Unlock(); + pthread_sigmask(SIG_SETMASK, &saved_thread_mask_, nullptr); + } + + private: + static StaticSpinMutex sigactions_mu; + sigset_t saved_thread_mask_; + + SignalSpinLocker(const SignalSpinLocker &) = delete; + SignalSpinLocker &operator=(const SignalSpinLocker &) = delete; +}; + +StaticSpinMutex SignalSpinLocker::sigactions_mu; + +extern "C" { +SANITIZER_INTERFACE_ATTRIBUTE int +__dfsw_stat(const char *path, struct stat *buf, dfsan_label path_label, + dfsan_label buf_label, dfsan_label *ret_label) { + int ret = stat(path, buf); + if (ret == 0) + dfsan_set_label(0, buf, sizeof(struct stat)); + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfso_stat( + const char *path, struct stat *buf, dfsan_label path_label, + dfsan_label buf_label, dfsan_label *ret_label, dfsan_origin path_origin, + dfsan_origin buf_origin, dfsan_origin *ret_origin) { + int ret = __dfsw_stat(path, buf, path_label, buf_label, ret_label); + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfsw_fstat(int fd, struct stat *buf, + dfsan_label fd_label, + dfsan_label buf_label, + dfsan_label *ret_label) { + int ret = fstat(fd, buf); + if (ret == 0) + dfsan_set_label(0, buf, sizeof(struct stat)); + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfso_fstat( + int fd, struct stat *buf, dfsan_label fd_label, dfsan_label buf_label, + dfsan_label *ret_label, dfsan_origin fd_origin, dfsan_origin buf_origin, + dfsan_origin *ret_origin) { + int ret = __dfsw_fstat(fd, buf, fd_label, buf_label, ret_label); + return ret; +} + +static char *dfsan_strchr_with_label(const char *s, int c, size_t *bytes_read, + dfsan_label s_label, dfsan_label c_label, + dfsan_label *ret_label) { + char *match_pos = nullptr; + for (size_t i = 0;; ++i) { + if (s[i] == c || s[i] == 0) { + // If s[i] is the \0 at the end of the string, and \0 is not the + // character we are searching for, then return null. + *bytes_read = i + 1; + match_pos = s[i] == 0 && c != 0 ? nullptr : const_cast<char *>(s + i); + break; + } + } + if (flags().strict_data_dependencies) + *ret_label = s_label; + else + *ret_label = dfsan_union(dfsan_read_label(s, *bytes_read), + dfsan_union(s_label, c_label)); + return match_pos; +} + +SANITIZER_INTERFACE_ATTRIBUTE char *__dfsw_strchr(const char *s, int c, + dfsan_label s_label, + dfsan_label c_label, + dfsan_label *ret_label) { + size_t bytes_read; + return dfsan_strchr_with_label(s, c, &bytes_read, s_label, c_label, + ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE char *__dfso_strchr( + const char *s, int c, dfsan_label s_label, dfsan_label c_label, + dfsan_label *ret_label, dfsan_origin s_origin, dfsan_origin c_origin, + dfsan_origin *ret_origin) { + size_t bytes_read; + char *r = + dfsan_strchr_with_label(s, c, &bytes_read, s_label, c_label, ret_label); + if (flags().strict_data_dependencies) { + *ret_origin = s_origin; + } else if (*ret_label) { + dfsan_origin o = dfsan_read_origin_of_first_taint(s, bytes_read); + *ret_origin = o ? o : (s_label ? s_origin : c_origin); + } + return r; +} + +SANITIZER_INTERFACE_ATTRIBUTE char *__dfsw_strpbrk(const char *s, + const char *accept, + dfsan_label s_label, + dfsan_label accept_label, + dfsan_label *ret_label) { + const char *ret = strpbrk(s, accept); + if (flags().strict_data_dependencies) { + *ret_label = ret ? s_label : 0; + } else { + size_t s_bytes_read = (ret ? ret - s : strlen(s)) + 1; + *ret_label = + dfsan_union(dfsan_read_label(s, s_bytes_read), + dfsan_union(dfsan_read_label(accept, strlen(accept) + 1), + dfsan_union(s_label, accept_label))); + } + return const_cast<char *>(ret); +} + +SANITIZER_INTERFACE_ATTRIBUTE char *__dfso_strpbrk( + const char *s, const char *accept, dfsan_label s_label, + dfsan_label accept_label, dfsan_label *ret_label, dfsan_origin s_origin, + dfsan_origin accept_origin, dfsan_origin *ret_origin) { + const char *ret = __dfsw_strpbrk(s, accept, s_label, accept_label, ret_label); + if (flags().strict_data_dependencies) { + if (ret) + *ret_origin = s_origin; + } else { + if (*ret_label) { + size_t s_bytes_read = (ret ? ret - s : strlen(s)) + 1; + dfsan_origin o = dfsan_read_origin_of_first_taint(s, s_bytes_read); + if (o) { + *ret_origin = o; + } else { + o = dfsan_read_origin_of_first_taint(accept, strlen(accept) + 1); + *ret_origin = o ? o : (s_label ? s_origin : accept_origin); + } + } + } + return const_cast<char *>(ret); +} + +static int dfsan_memcmp_bcmp(const void *s1, const void *s2, size_t n, + size_t *bytes_read) { + const char *cs1 = (const char *) s1, *cs2 = (const char *) s2; + for (size_t i = 0; i != n; ++i) { + if (cs1[i] != cs2[i]) { + *bytes_read = i + 1; + return cs1[i] - cs2[i]; + } + } + *bytes_read = n; + return 0; +} + +static dfsan_label dfsan_get_memcmp_label(const void *s1, const void *s2, + size_t pos) { + if (flags().strict_data_dependencies) + return 0; + return dfsan_union(dfsan_read_label(s1, pos), dfsan_read_label(s2, pos)); +} + +static void dfsan_get_memcmp_origin(const void *s1, const void *s2, size_t pos, + dfsan_label *ret_label, + dfsan_origin *ret_origin) { + *ret_label = dfsan_get_memcmp_label(s1, s2, pos); + if (*ret_label == 0) + return; + dfsan_origin o = dfsan_read_origin_of_first_taint(s1, pos); + *ret_origin = o ? o : dfsan_read_origin_of_first_taint(s2, pos); +} + +static int dfsan_memcmp_bcmp_label(const void *s1, const void *s2, size_t n, + dfsan_label *ret_label) { + size_t bytes_read; + int r = dfsan_memcmp_bcmp(s1, s2, n, &bytes_read); + *ret_label = dfsan_get_memcmp_label(s1, s2, bytes_read); + return r; +} + +static int dfsan_memcmp_bcmp_origin(const void *s1, const void *s2, size_t n, + dfsan_label *ret_label, + dfsan_origin *ret_origin) { + size_t bytes_read; + int r = dfsan_memcmp_bcmp(s1, s2, n, &bytes_read); + dfsan_get_memcmp_origin(s1, s2, bytes_read, ret_label, ret_origin); + return r; +} + +DECLARE_WEAK_INTERCEPTOR_HOOK(dfsan_weak_hook_memcmp, uptr caller_pc, + const void *s1, const void *s2, size_t n, + dfsan_label s1_label, dfsan_label s2_label, + dfsan_label n_label) + +DECLARE_WEAK_INTERCEPTOR_HOOK(dfsan_weak_hook_origin_memcmp, uptr caller_pc, + const void *s1, const void *s2, size_t n, + dfsan_label s1_label, dfsan_label s2_label, + dfsan_label n_label, dfsan_origin s1_origin, + dfsan_origin s2_origin, dfsan_origin n_origin) + +SANITIZER_INTERFACE_ATTRIBUTE int __dfsw_memcmp(const void *s1, const void *s2, + size_t n, dfsan_label s1_label, + dfsan_label s2_label, + dfsan_label n_label, + dfsan_label *ret_label) { + CALL_WEAK_INTERCEPTOR_HOOK(dfsan_weak_hook_memcmp, GET_CALLER_PC(), s1, s2, n, + s1_label, s2_label, n_label); + return dfsan_memcmp_bcmp_label(s1, s2, n, ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfso_memcmp( + const void *s1, const void *s2, size_t n, dfsan_label s1_label, + dfsan_label s2_label, dfsan_label n_label, dfsan_label *ret_label, + dfsan_origin s1_origin, dfsan_origin s2_origin, dfsan_origin n_origin, + dfsan_origin *ret_origin) { + CALL_WEAK_INTERCEPTOR_HOOK(dfsan_weak_hook_origin_memcmp, GET_CALLER_PC(), s1, + s2, n, s1_label, s2_label, n_label, s1_origin, + s2_origin, n_origin); + return dfsan_memcmp_bcmp_origin(s1, s2, n, ret_label, ret_origin); +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfsw_bcmp(const void *s1, const void *s2, + size_t n, dfsan_label s1_label, + dfsan_label s2_label, + dfsan_label n_label, + dfsan_label *ret_label) { + return dfsan_memcmp_bcmp_label(s1, s2, n, ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfso_bcmp( + const void *s1, const void *s2, size_t n, dfsan_label s1_label, + dfsan_label s2_label, dfsan_label n_label, dfsan_label *ret_label, + dfsan_origin s1_origin, dfsan_origin s2_origin, dfsan_origin n_origin, + dfsan_origin *ret_origin) { + return dfsan_memcmp_bcmp_origin(s1, s2, n, ret_label, ret_origin); +} + +// When n == 0, compare strings without byte limit. +// When n > 0, compare the first (at most) n bytes of s1 and s2. +static int dfsan_strncmp(const char *s1, const char *s2, size_t n, + size_t *bytes_read) { + for (size_t i = 0;; ++i) { + if (s1[i] != s2[i] || s1[i] == 0 || s2[i] == 0 || (n > 0 && i == n - 1)) { + *bytes_read = i + 1; + return s1[i] - s2[i]; + } + } +} + +DECLARE_WEAK_INTERCEPTOR_HOOK(dfsan_weak_hook_strcmp, uptr caller_pc, + const char *s1, const char *s2, + dfsan_label s1_label, dfsan_label s2_label) + +DECLARE_WEAK_INTERCEPTOR_HOOK(dfsan_weak_hook_origin_strcmp, uptr caller_pc, + const char *s1, const char *s2, + dfsan_label s1_label, dfsan_label s2_label, + dfsan_origin s1_origin, dfsan_origin s2_origin) + +SANITIZER_INTERFACE_ATTRIBUTE int __dfsw_strcmp(const char *s1, const char *s2, + dfsan_label s1_label, + dfsan_label s2_label, + dfsan_label *ret_label) { + CALL_WEAK_INTERCEPTOR_HOOK(dfsan_weak_hook_strcmp, GET_CALLER_PC(), s1, s2, + s1_label, s2_label); + size_t bytes_read; + int r = dfsan_strncmp(s1, s2, 0, &bytes_read); + *ret_label = dfsan_get_memcmp_label(s1, s2, bytes_read); + return r; +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfso_strcmp( + const char *s1, const char *s2, dfsan_label s1_label, dfsan_label s2_label, + dfsan_label *ret_label, dfsan_origin s1_origin, dfsan_origin s2_origin, + dfsan_origin *ret_origin) { + CALL_WEAK_INTERCEPTOR_HOOK(dfsan_weak_hook_origin_strcmp, GET_CALLER_PC(), s1, + s2, s1_label, s2_label, s1_origin, s2_origin); + size_t bytes_read; + int r = dfsan_strncmp(s1, s2, 0, &bytes_read); + dfsan_get_memcmp_origin(s1, s2, bytes_read, ret_label, ret_origin); + return r; +} + +// When n == 0, compare strings without byte limit. +// When n > 0, compare the first (at most) n bytes of s1 and s2. +static int dfsan_strncasecmp(const char *s1, const char *s2, size_t n, + size_t *bytes_read) { + for (size_t i = 0;; ++i) { + char s1_lower = tolower(s1[i]); + char s2_lower = tolower(s2[i]); + + if (s1_lower != s2_lower || s1[i] == 0 || s2[i] == 0 || + (n > 0 && i == n - 1)) { + *bytes_read = i + 1; + return s1_lower - s2_lower; + } + } +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfsw_strcasecmp(const char *s1, + const char *s2, + dfsan_label s1_label, + dfsan_label s2_label, + dfsan_label *ret_label) { + size_t bytes_read; + int r = dfsan_strncasecmp(s1, s2, 0, &bytes_read); + *ret_label = dfsan_get_memcmp_label(s1, s2, bytes_read); + return r; +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfso_strcasecmp( + const char *s1, const char *s2, dfsan_label s1_label, dfsan_label s2_label, + dfsan_label *ret_label, dfsan_origin s1_origin, dfsan_origin s2_origin, + dfsan_origin *ret_origin) { + size_t bytes_read; + int r = dfsan_strncasecmp(s1, s2, 0, &bytes_read); + dfsan_get_memcmp_origin(s1, s2, bytes_read, ret_label, ret_origin); + return r; +} + +DECLARE_WEAK_INTERCEPTOR_HOOK(dfsan_weak_hook_strncmp, uptr caller_pc, + const char *s1, const char *s2, size_t n, + dfsan_label s1_label, dfsan_label s2_label, + dfsan_label n_label) + +DECLARE_WEAK_INTERCEPTOR_HOOK(dfsan_weak_hook_origin_strncmp, uptr caller_pc, + const char *s1, const char *s2, size_t n, + dfsan_label s1_label, dfsan_label s2_label, + dfsan_label n_label, dfsan_origin s1_origin, + dfsan_origin s2_origin, dfsan_origin n_origin) + +SANITIZER_INTERFACE_ATTRIBUTE int __dfsw_strncmp(const char *s1, const char *s2, + size_t n, dfsan_label s1_label, + dfsan_label s2_label, + dfsan_label n_label, + dfsan_label *ret_label) { + if (n == 0) { + *ret_label = 0; + return 0; + } + + CALL_WEAK_INTERCEPTOR_HOOK(dfsan_weak_hook_strncmp, GET_CALLER_PC(), s1, s2, + n, s1_label, s2_label, n_label); + + size_t bytes_read; + int r = dfsan_strncmp(s1, s2, n, &bytes_read); + *ret_label = dfsan_get_memcmp_label(s1, s2, bytes_read); + return r; +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfso_strncmp( + const char *s1, const char *s2, size_t n, dfsan_label s1_label, + dfsan_label s2_label, dfsan_label n_label, dfsan_label *ret_label, + dfsan_origin s1_origin, dfsan_origin s2_origin, dfsan_origin n_origin, + dfsan_origin *ret_origin) { + if (n == 0) { + *ret_label = 0; + return 0; + } + + CALL_WEAK_INTERCEPTOR_HOOK(dfsan_weak_hook_origin_strncmp, GET_CALLER_PC(), + s1, s2, n, s1_label, s2_label, n_label, s1_origin, + s2_origin, n_origin); + + size_t bytes_read; + int r = dfsan_strncmp(s1, s2, n, &bytes_read); + dfsan_get_memcmp_origin(s1, s2, bytes_read, ret_label, ret_origin); + return r; +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfsw_strncasecmp( + const char *s1, const char *s2, size_t n, dfsan_label s1_label, + dfsan_label s2_label, dfsan_label n_label, dfsan_label *ret_label) { + if (n == 0) { + *ret_label = 0; + return 0; + } + + size_t bytes_read; + int r = dfsan_strncasecmp(s1, s2, n, &bytes_read); + *ret_label = dfsan_get_memcmp_label(s1, s2, bytes_read); + return r; +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfso_strncasecmp( + const char *s1, const char *s2, size_t n, dfsan_label s1_label, + dfsan_label s2_label, dfsan_label n_label, dfsan_label *ret_label, + dfsan_origin s1_origin, dfsan_origin s2_origin, dfsan_origin n_origin, + dfsan_origin *ret_origin) { + if (n == 0) { + *ret_label = 0; + return 0; + } + + size_t bytes_read; + int r = dfsan_strncasecmp(s1, s2, n, &bytes_read); + dfsan_get_memcmp_origin(s1, s2, bytes_read, ret_label, ret_origin); + return r; +} + + +SANITIZER_INTERFACE_ATTRIBUTE size_t +__dfsw_strlen(const char *s, dfsan_label s_label, dfsan_label *ret_label) { + size_t ret = strlen(s); + if (flags().strict_data_dependencies) { + *ret_label = 0; + } else { + *ret_label = dfsan_read_label(s, ret + 1); + } + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE size_t __dfso_strlen(const char *s, + dfsan_label s_label, + dfsan_label *ret_label, + dfsan_origin s_origin, + dfsan_origin *ret_origin) { + size_t ret = __dfsw_strlen(s, s_label, ret_label); + if (!flags().strict_data_dependencies) + *ret_origin = dfsan_read_origin_of_first_taint(s, ret + 1); + return ret; +} + +static void *dfsan_memmove(void *dest, const void *src, size_t n) { + dfsan_label *sdest = shadow_for(dest); + const dfsan_label *ssrc = shadow_for(src); + internal_memmove((void *)sdest, (const void *)ssrc, n * sizeof(dfsan_label)); + return internal_memmove(dest, src, n); +} + +static void *dfsan_memmove_with_origin(void *dest, const void *src, size_t n) { + dfsan_mem_origin_transfer(dest, src, n); + return dfsan_memmove(dest, src, n); +} + +static void *dfsan_memcpy(void *dest, const void *src, size_t n) { + dfsan_mem_shadow_transfer(dest, src, n); + return internal_memcpy(dest, src, n); +} + +static void *dfsan_memcpy_with_origin(void *dest, const void *src, size_t n) { + dfsan_mem_origin_transfer(dest, src, n); + return dfsan_memcpy(dest, src, n); +} + +static void dfsan_memset(void *s, int c, dfsan_label c_label, size_t n) { + internal_memset(s, c, n); + dfsan_set_label(c_label, s, n); +} + +static void dfsan_memset_with_origin(void *s, int c, dfsan_label c_label, + dfsan_origin c_origin, size_t n) { + internal_memset(s, c, n); + dfsan_set_label_origin(c_label, c_origin, s, n); +} + +SANITIZER_INTERFACE_ATTRIBUTE +void *__dfsw_memcpy(void *dest, const void *src, size_t n, + dfsan_label dest_label, dfsan_label src_label, + dfsan_label n_label, dfsan_label *ret_label) { + *ret_label = dest_label; + return dfsan_memcpy(dest, src, n); +} + +SANITIZER_INTERFACE_ATTRIBUTE +void *__dfso_memcpy(void *dest, const void *src, size_t n, + dfsan_label dest_label, dfsan_label src_label, + dfsan_label n_label, dfsan_label *ret_label, + dfsan_origin dest_origin, dfsan_origin src_origin, + dfsan_origin n_origin, dfsan_origin *ret_origin) { + *ret_label = dest_label; + *ret_origin = dest_origin; + return dfsan_memcpy_with_origin(dest, src, n); +} + +SANITIZER_INTERFACE_ATTRIBUTE +void *__dfsw_memmove(void *dest, const void *src, size_t n, + dfsan_label dest_label, dfsan_label src_label, + dfsan_label n_label, dfsan_label *ret_label) { + *ret_label = dest_label; + return dfsan_memmove(dest, src, n); +} + +SANITIZER_INTERFACE_ATTRIBUTE +void *__dfso_memmove(void *dest, const void *src, size_t n, + dfsan_label dest_label, dfsan_label src_label, + dfsan_label n_label, dfsan_label *ret_label, + dfsan_origin dest_origin, dfsan_origin src_origin, + dfsan_origin n_origin, dfsan_origin *ret_origin) { + *ret_label = dest_label; + *ret_origin = dest_origin; + return dfsan_memmove_with_origin(dest, src, n); +} + +SANITIZER_INTERFACE_ATTRIBUTE +void *__dfsw_memset(void *s, int c, size_t n, + dfsan_label s_label, dfsan_label c_label, + dfsan_label n_label, dfsan_label *ret_label) { + dfsan_memset(s, c, c_label, n); + *ret_label = s_label; + return s; +} + +SANITIZER_INTERFACE_ATTRIBUTE +void *__dfso_memset(void *s, int c, size_t n, dfsan_label s_label, + dfsan_label c_label, dfsan_label n_label, + dfsan_label *ret_label, dfsan_origin s_origin, + dfsan_origin c_origin, dfsan_origin n_origin, + dfsan_origin *ret_origin) { + dfsan_memset_with_origin(s, c, c_label, c_origin, n); + *ret_label = s_label; + *ret_origin = s_origin; + return s; +} + +SANITIZER_INTERFACE_ATTRIBUTE char *__dfsw_strcat(char *dest, const char *src, + dfsan_label dest_label, + dfsan_label src_label, + dfsan_label *ret_label) { + size_t dest_len = strlen(dest); + char *ret = strcat(dest, src); + dfsan_mem_shadow_transfer(dest + dest_len, src, strlen(src)); + *ret_label = dest_label; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE char *__dfso_strcat( + char *dest, const char *src, dfsan_label dest_label, dfsan_label src_label, + dfsan_label *ret_label, dfsan_origin dest_origin, dfsan_origin src_origin, + dfsan_origin *ret_origin) { + size_t dest_len = strlen(dest); + char *ret = strcat(dest, src); + size_t src_len = strlen(src); + dfsan_mem_origin_transfer(dest + dest_len, src, src_len); + dfsan_mem_shadow_transfer(dest + dest_len, src, src_len); + *ret_label = dest_label; + *ret_origin = dest_origin; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE char * +__dfsw_strdup(const char *s, dfsan_label s_label, dfsan_label *ret_label) { + size_t len = strlen(s); + void *p = malloc(len+1); + dfsan_memcpy(p, s, len+1); + *ret_label = 0; + return static_cast<char *>(p); +} + +SANITIZER_INTERFACE_ATTRIBUTE char *__dfso_strdup(const char *s, + dfsan_label s_label, + dfsan_label *ret_label, + dfsan_origin s_origin, + dfsan_origin *ret_origin) { + size_t len = strlen(s); + void *p = malloc(len + 1); + dfsan_memcpy_with_origin(p, s, len + 1); + *ret_label = 0; + return static_cast<char *>(p); +} + +SANITIZER_INTERFACE_ATTRIBUTE char * +__dfsw_strncpy(char *s1, const char *s2, size_t n, dfsan_label s1_label, + dfsan_label s2_label, dfsan_label n_label, + dfsan_label *ret_label) { + size_t len = strlen(s2); + if (len < n) { + dfsan_memcpy(s1, s2, len+1); + dfsan_memset(s1+len+1, 0, 0, n-len-1); + } else { + dfsan_memcpy(s1, s2, n); + } + + *ret_label = s1_label; + return s1; +} + +SANITIZER_INTERFACE_ATTRIBUTE char *__dfso_strncpy( + char *s1, const char *s2, size_t n, dfsan_label s1_label, + dfsan_label s2_label, dfsan_label n_label, dfsan_label *ret_label, + dfsan_origin s1_origin, dfsan_origin s2_origin, dfsan_origin n_origin, + dfsan_origin *ret_origin) { + size_t len = strlen(s2); + if (len < n) { + dfsan_memcpy_with_origin(s1, s2, len + 1); + dfsan_memset_with_origin(s1 + len + 1, 0, 0, 0, n - len - 1); + } else { + dfsan_memcpy_with_origin(s1, s2, n); + } + + *ret_label = s1_label; + *ret_origin = s1_origin; + return s1; +} + +SANITIZER_INTERFACE_ATTRIBUTE ssize_t +__dfsw_pread(int fd, void *buf, size_t count, off_t offset, + dfsan_label fd_label, dfsan_label buf_label, + dfsan_label count_label, dfsan_label offset_label, + dfsan_label *ret_label) { + ssize_t ret = pread(fd, buf, count, offset); + if (ret > 0) + dfsan_set_label(0, buf, ret); + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE ssize_t __dfso_pread( + int fd, void *buf, size_t count, off_t offset, dfsan_label fd_label, + dfsan_label buf_label, dfsan_label count_label, dfsan_label offset_label, + dfsan_label *ret_label, dfsan_origin fd_origin, dfsan_origin buf_origin, + dfsan_origin count_origin, dfsan_label offset_origin, + dfsan_origin *ret_origin) { + return __dfsw_pread(fd, buf, count, offset, fd_label, buf_label, count_label, + offset_label, ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE ssize_t +__dfsw_read(int fd, void *buf, size_t count, + dfsan_label fd_label, dfsan_label buf_label, + dfsan_label count_label, + dfsan_label *ret_label) { + ssize_t ret = read(fd, buf, count); + if (ret > 0) + dfsan_set_label(0, buf, ret); + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE ssize_t __dfso_read( + int fd, void *buf, size_t count, dfsan_label fd_label, + dfsan_label buf_label, dfsan_label count_label, dfsan_label *ret_label, + dfsan_origin fd_origin, dfsan_origin buf_origin, dfsan_origin count_origin, + dfsan_origin *ret_origin) { + return __dfsw_read(fd, buf, count, fd_label, buf_label, count_label, + ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfsw_clock_gettime(clockid_t clk_id, + struct timespec *tp, + dfsan_label clk_id_label, + dfsan_label tp_label, + dfsan_label *ret_label) { + int ret = clock_gettime(clk_id, tp); + if (ret == 0) + dfsan_set_label(0, tp, sizeof(struct timespec)); + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfso_clock_gettime( + clockid_t clk_id, struct timespec *tp, dfsan_label clk_id_label, + dfsan_label tp_label, dfsan_label *ret_label, dfsan_origin clk_id_origin, + dfsan_origin tp_origin, dfsan_origin *ret_origin) { + return __dfsw_clock_gettime(clk_id, tp, clk_id_label, tp_label, ret_label); +} + +static void dfsan_set_zero_label(const void *ptr, uptr size) { + dfsan_set_label(0, const_cast<void *>(ptr), size); +} + +// dlopen() ultimately calls mmap() down inside the loader, which generally +// doesn't participate in dynamic symbol resolution. Therefore we won't +// intercept its calls to mmap, and we have to hook it here. +SANITIZER_INTERFACE_ATTRIBUTE void * +__dfsw_dlopen(const char *filename, int flag, dfsan_label filename_label, + dfsan_label flag_label, dfsan_label *ret_label) { + void *handle = dlopen(filename, flag); + link_map *map = GET_LINK_MAP_BY_DLOPEN_HANDLE(handle); + if (map) + ForEachMappedRegion(map, dfsan_set_zero_label); + *ret_label = 0; + return handle; +} + +SANITIZER_INTERFACE_ATTRIBUTE void *__dfso_dlopen( + const char *filename, int flag, dfsan_label filename_label, + dfsan_label flag_label, dfsan_label *ret_label, + dfsan_origin filename_origin, dfsan_origin flag_origin, + dfsan_origin *ret_origin) { + return __dfsw_dlopen(filename, flag, filename_label, flag_label, ret_label); +} + +static void *DFsanThreadStartFunc(void *arg) { + DFsanThread *t = (DFsanThread *)arg; + SetCurrentThread(t); + t->Init(); + SetSigProcMask(&t->starting_sigset_, nullptr); + return t->ThreadStart(); +} + +static int dfsan_pthread_create(pthread_t *thread, const pthread_attr_t *attr, + void *start_routine_trampoline, + void *start_routine, void *arg, + dfsan_label *ret_label, + bool track_origins = false) { + pthread_attr_t myattr; + if (!attr) { + pthread_attr_init(&myattr); + attr = &myattr; + } + + // Ensure that the thread stack is large enough to hold all TLS data. + AdjustStackSize((void *)(const_cast<pthread_attr_t *>(attr))); + + DFsanThread *t = + DFsanThread::Create(start_routine_trampoline, + (thread_callback_t)start_routine, arg, track_origins); + ScopedBlockSignals block(&t->starting_sigset_); + int res = pthread_create(thread, attr, DFsanThreadStartFunc, t); + + if (attr == &myattr) + pthread_attr_destroy(&myattr); + *ret_label = 0; + return res; +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfsw_pthread_create( + pthread_t *thread, const pthread_attr_t *attr, + void *(*start_routine_trampoline)(void *, void *, dfsan_label, + dfsan_label *), + void *start_routine, void *arg, dfsan_label thread_label, + dfsan_label attr_label, dfsan_label start_routine_label, + dfsan_label arg_label, dfsan_label *ret_label) { + return dfsan_pthread_create(thread, attr, (void *)start_routine_trampoline, + start_routine, arg, ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfso_pthread_create( + pthread_t *thread, const pthread_attr_t *attr, + void *(*start_routine_trampoline)(void *, void *, dfsan_label, + dfsan_label *, dfsan_origin, + dfsan_origin *), + void *start_routine, void *arg, dfsan_label thread_label, + dfsan_label attr_label, dfsan_label start_routine_label, + dfsan_label arg_label, dfsan_label *ret_label, dfsan_origin thread_origin, + dfsan_origin attr_origin, dfsan_origin start_routine_origin, + dfsan_origin arg_origin, dfsan_origin *ret_origin) { + return dfsan_pthread_create(thread, attr, (void *)start_routine_trampoline, + start_routine, arg, ret_label, true); +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfsw_pthread_join(pthread_t thread, + void **retval, + dfsan_label thread_label, + dfsan_label retval_label, + dfsan_label *ret_label) { + int ret = pthread_join(thread, retval); + if (ret == 0 && retval) + dfsan_set_label(0, retval, sizeof(*retval)); + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfso_pthread_join( + pthread_t thread, void **retval, dfsan_label thread_label, + dfsan_label retval_label, dfsan_label *ret_label, + dfsan_origin thread_origin, dfsan_origin retval_origin, + dfsan_origin *ret_origin) { + return __dfsw_pthread_join(thread, retval, thread_label, retval_label, + ret_label); +} + +struct dl_iterate_phdr_info { + int (*callback_trampoline)(void *callback, struct dl_phdr_info *info, + size_t size, void *data, dfsan_label info_label, + dfsan_label size_label, dfsan_label data_label, + dfsan_label *ret_label); + void *callback; + void *data; +}; + +struct dl_iterate_phdr_origin_info { + int (*callback_trampoline)(void *callback, struct dl_phdr_info *info, + size_t size, void *data, dfsan_label info_label, + dfsan_label size_label, dfsan_label data_label, + dfsan_label *ret_label, dfsan_origin info_origin, + dfsan_origin size_origin, dfsan_origin data_origin, + dfsan_origin *ret_origin); + void *callback; + void *data; +}; + +int dl_iterate_phdr_cb(struct dl_phdr_info *info, size_t size, void *data) { + dl_iterate_phdr_info *dipi = (dl_iterate_phdr_info *)data; + dfsan_set_label(0, *info); + dfsan_set_label(0, const_cast<char *>(info->dlpi_name), + strlen(info->dlpi_name) + 1); + dfsan_set_label( + 0, const_cast<char *>(reinterpret_cast<const char *>(info->dlpi_phdr)), + sizeof(*info->dlpi_phdr) * info->dlpi_phnum); + dfsan_label ret_label; + return dipi->callback_trampoline(dipi->callback, info, size, dipi->data, 0, 0, + 0, &ret_label); +} + +int dl_iterate_phdr_origin_cb(struct dl_phdr_info *info, size_t size, + void *data) { + dl_iterate_phdr_origin_info *dipi = (dl_iterate_phdr_origin_info *)data; + dfsan_set_label(0, *info); + dfsan_set_label(0, const_cast<char *>(info->dlpi_name), + strlen(info->dlpi_name) + 1); + dfsan_set_label( + 0, const_cast<char *>(reinterpret_cast<const char *>(info->dlpi_phdr)), + sizeof(*info->dlpi_phdr) * info->dlpi_phnum); + dfsan_label ret_label; + dfsan_origin ret_origin; + return dipi->callback_trampoline(dipi->callback, info, size, dipi->data, 0, 0, + 0, &ret_label, 0, 0, 0, &ret_origin); +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfsw_dl_iterate_phdr( + int (*callback_trampoline)(void *callback, struct dl_phdr_info *info, + size_t size, void *data, dfsan_label info_label, + dfsan_label size_label, dfsan_label data_label, + dfsan_label *ret_label), + void *callback, void *data, dfsan_label callback_label, + dfsan_label data_label, dfsan_label *ret_label) { + dl_iterate_phdr_info dipi = { callback_trampoline, callback, data }; + *ret_label = 0; + return dl_iterate_phdr(dl_iterate_phdr_cb, &dipi); +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfso_dl_iterate_phdr( + int (*callback_trampoline)(void *callback, struct dl_phdr_info *info, + size_t size, void *data, dfsan_label info_label, + dfsan_label size_label, dfsan_label data_label, + dfsan_label *ret_label, dfsan_origin info_origin, + dfsan_origin size_origin, + dfsan_origin data_origin, + dfsan_origin *ret_origin), + void *callback, void *data, dfsan_label callback_label, + dfsan_label data_label, dfsan_label *ret_label, + dfsan_origin callback_origin, dfsan_origin data_origin, + dfsan_origin *ret_origin) { + dl_iterate_phdr_origin_info dipi = {callback_trampoline, callback, data}; + *ret_label = 0; + return dl_iterate_phdr(dl_iterate_phdr_origin_cb, &dipi); +} + +// This function is only available for glibc 2.27 or newer. Mark it weak so +// linking succeeds with older glibcs. +SANITIZER_WEAK_ATTRIBUTE void _dl_get_tls_static_info(size_t *sizep, + size_t *alignp); + +SANITIZER_INTERFACE_ATTRIBUTE void __dfsw__dl_get_tls_static_info( + size_t *sizep, size_t *alignp, dfsan_label sizep_label, + dfsan_label alignp_label) { + assert(_dl_get_tls_static_info); + _dl_get_tls_static_info(sizep, alignp); + dfsan_set_label(0, sizep, sizeof(*sizep)); + dfsan_set_label(0, alignp, sizeof(*alignp)); +} + +SANITIZER_INTERFACE_ATTRIBUTE void __dfso__dl_get_tls_static_info( + size_t *sizep, size_t *alignp, dfsan_label sizep_label, + dfsan_label alignp_label, dfsan_origin sizep_origin, + dfsan_origin alignp_origin) { + __dfsw__dl_get_tls_static_info(sizep, alignp, sizep_label, alignp_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE +char *__dfsw_ctime_r(const time_t *timep, char *buf, dfsan_label timep_label, + dfsan_label buf_label, dfsan_label *ret_label) { + char *ret = ctime_r(timep, buf); + if (ret) { + dfsan_set_label(dfsan_read_label(timep, sizeof(time_t)), buf, + strlen(buf) + 1); + *ret_label = buf_label; + } else { + *ret_label = 0; + } + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +char *__dfso_ctime_r(const time_t *timep, char *buf, dfsan_label timep_label, + dfsan_label buf_label, dfsan_label *ret_label, + dfsan_origin timep_origin, dfsan_origin buf_origin, + dfsan_origin *ret_origin) { + char *ret = ctime_r(timep, buf); + if (ret) { + dfsan_set_label_origin( + dfsan_read_label(timep, sizeof(time_t)), + dfsan_read_origin_of_first_taint(timep, sizeof(time_t)), buf, + strlen(buf) + 1); + *ret_label = buf_label; + *ret_origin = buf_origin; + } else { + *ret_label = 0; + } + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +char *__dfsw_fgets(char *s, int size, FILE *stream, dfsan_label s_label, + dfsan_label size_label, dfsan_label stream_label, + dfsan_label *ret_label) { + char *ret = fgets(s, size, stream); + if (ret) { + dfsan_set_label(0, ret, strlen(ret) + 1); + *ret_label = s_label; + } else { + *ret_label = 0; + } + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +char *__dfso_fgets(char *s, int size, FILE *stream, dfsan_label s_label, + dfsan_label size_label, dfsan_label stream_label, + dfsan_label *ret_label, dfsan_origin s_origin, + dfsan_origin size_origin, dfsan_origin stream_origin, + dfsan_origin *ret_origin) { + char *ret = __dfsw_fgets(s, size, stream, s_label, size_label, stream_label, + ret_label); + if (ret) + *ret_origin = s_origin; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +char *__dfsw_getcwd(char *buf, size_t size, dfsan_label buf_label, + dfsan_label size_label, dfsan_label *ret_label) { + char *ret = getcwd(buf, size); + if (ret) { + dfsan_set_label(0, ret, strlen(ret) + 1); + *ret_label = buf_label; + } else { + *ret_label = 0; + } + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +char *__dfso_getcwd(char *buf, size_t size, dfsan_label buf_label, + dfsan_label size_label, dfsan_label *ret_label, + dfsan_origin buf_origin, dfsan_origin size_origin, + dfsan_origin *ret_origin) { + char *ret = __dfsw_getcwd(buf, size, buf_label, size_label, ret_label); + if (ret) + *ret_origin = buf_origin; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +char *__dfsw_get_current_dir_name(dfsan_label *ret_label) { + char *ret = get_current_dir_name(); + if (ret) + dfsan_set_label(0, ret, strlen(ret) + 1); + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +char *__dfso_get_current_dir_name(dfsan_label *ret_label, + dfsan_origin *ret_origin) { + return __dfsw_get_current_dir_name(ret_label); +} + +// This function is only available for glibc 2.25 or newer. Mark it weak so +// linking succeeds with older glibcs. +SANITIZER_WEAK_ATTRIBUTE int getentropy(void *buffer, size_t length); + +SANITIZER_INTERFACE_ATTRIBUTE int __dfsw_getentropy(void *buffer, size_t length, + dfsan_label buffer_label, + dfsan_label length_label, + dfsan_label *ret_label) { + int ret = getentropy(buffer, length); + if (ret == 0) { + dfsan_set_label(0, buffer, length); + } + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfso_getentropy(void *buffer, size_t length, + dfsan_label buffer_label, + dfsan_label length_label, + dfsan_label *ret_label, + dfsan_origin buffer_origin, + dfsan_origin length_origin, + dfsan_origin *ret_origin) { + return __dfsw_getentropy(buffer, length, buffer_label, length_label, + ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfsw_gethostname(char *name, size_t len, dfsan_label name_label, + dfsan_label len_label, dfsan_label *ret_label) { + int ret = gethostname(name, len); + if (ret == 0) { + dfsan_set_label(0, name, strlen(name) + 1); + } + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfso_gethostname(char *name, size_t len, dfsan_label name_label, + dfsan_label len_label, dfsan_label *ret_label, + dfsan_origin name_origin, dfsan_origin len_origin, + dfsan_label *ret_origin) { + return __dfsw_gethostname(name, len, name_label, len_label, ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfsw_getrlimit(int resource, struct rlimit *rlim, + dfsan_label resource_label, dfsan_label rlim_label, + dfsan_label *ret_label) { + int ret = getrlimit(resource, rlim); + if (ret == 0) { + dfsan_set_label(0, rlim, sizeof(struct rlimit)); + } + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfso_getrlimit(int resource, struct rlimit *rlim, + dfsan_label resource_label, dfsan_label rlim_label, + dfsan_label *ret_label, dfsan_origin resource_origin, + dfsan_origin rlim_origin, dfsan_origin *ret_origin) { + return __dfsw_getrlimit(resource, rlim, resource_label, rlim_label, + ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfsw_getrusage(int who, struct rusage *usage, dfsan_label who_label, + dfsan_label usage_label, dfsan_label *ret_label) { + int ret = getrusage(who, usage); + if (ret == 0) { + dfsan_set_label(0, usage, sizeof(struct rusage)); + } + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfso_getrusage(int who, struct rusage *usage, dfsan_label who_label, + dfsan_label usage_label, dfsan_label *ret_label, + dfsan_origin who_origin, dfsan_origin usage_origin, + dfsan_label *ret_origin) { + return __dfsw_getrusage(who, usage, who_label, usage_label, ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE +char *__dfsw_strcpy(char *dest, const char *src, dfsan_label dst_label, + dfsan_label src_label, dfsan_label *ret_label) { + char *ret = strcpy(dest, src); + if (ret) { + dfsan_mem_shadow_transfer(dest, src, strlen(src) + 1); + } + *ret_label = dst_label; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +char *__dfso_strcpy(char *dest, const char *src, dfsan_label dst_label, + dfsan_label src_label, dfsan_label *ret_label, + dfsan_origin dst_origin, dfsan_origin src_origin, + dfsan_origin *ret_origin) { + char *ret = strcpy(dest, src); + if (ret) { + size_t str_len = strlen(src) + 1; + dfsan_mem_origin_transfer(dest, src, str_len); + dfsan_mem_shadow_transfer(dest, src, str_len); + } + *ret_label = dst_label; + *ret_origin = dst_origin; + return ret; +} + +static long int dfsan_strtol(const char *nptr, char **endptr, int base, + char **tmp_endptr) { + assert(tmp_endptr); + long int ret = strtol(nptr, tmp_endptr, base); + if (endptr) + *endptr = *tmp_endptr; + return ret; +} + +static void dfsan_strtolong_label(const char *nptr, const char *tmp_endptr, + dfsan_label base_label, + dfsan_label *ret_label) { + if (tmp_endptr > nptr) { + // If *tmp_endptr is '\0' include its label as well. + *ret_label = dfsan_union( + base_label, + dfsan_read_label(nptr, tmp_endptr - nptr + (*tmp_endptr ? 0 : 1))); + } else { + *ret_label = 0; + } +} + +static void dfsan_strtolong_origin(const char *nptr, const char *tmp_endptr, + dfsan_label base_label, + dfsan_label *ret_label, + dfsan_origin base_origin, + dfsan_origin *ret_origin) { + if (tmp_endptr > nptr) { + // When multiple inputs are tainted, we propagate one of its origins. + // Because checking if base_label is tainted does not need additional + // computation, we prefer to propagating base_origin. + *ret_origin = base_label + ? base_origin + : dfsan_read_origin_of_first_taint( + nptr, tmp_endptr - nptr + (*tmp_endptr ? 0 : 1)); + } +} + +SANITIZER_INTERFACE_ATTRIBUTE +long int __dfsw_strtol(const char *nptr, char **endptr, int base, + dfsan_label nptr_label, dfsan_label endptr_label, + dfsan_label base_label, dfsan_label *ret_label) { + char *tmp_endptr; + long int ret = dfsan_strtol(nptr, endptr, base, &tmp_endptr); + dfsan_strtolong_label(nptr, tmp_endptr, base_label, ret_label); + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +long int __dfso_strtol(const char *nptr, char **endptr, int base, + dfsan_label nptr_label, dfsan_label endptr_label, + dfsan_label base_label, dfsan_label *ret_label, + dfsan_origin nptr_origin, dfsan_origin endptr_origin, + dfsan_origin base_origin, dfsan_origin *ret_origin) { + char *tmp_endptr; + long int ret = dfsan_strtol(nptr, endptr, base, &tmp_endptr); + dfsan_strtolong_label(nptr, tmp_endptr, base_label, ret_label); + dfsan_strtolong_origin(nptr, tmp_endptr, base_label, ret_label, base_origin, + ret_origin); + return ret; +} + +static double dfsan_strtod(const char *nptr, char **endptr, char **tmp_endptr) { + assert(tmp_endptr); + double ret = strtod(nptr, tmp_endptr); + if (endptr) + *endptr = *tmp_endptr; + return ret; +} + +static void dfsan_strtod_label(const char *nptr, const char *tmp_endptr, + dfsan_label *ret_label) { + if (tmp_endptr > nptr) { + // If *tmp_endptr is '\0' include its label as well. + *ret_label = dfsan_read_label( + nptr, + tmp_endptr - nptr + (*tmp_endptr ? 0 : 1)); + } else { + *ret_label = 0; + } +} + +SANITIZER_INTERFACE_ATTRIBUTE +double __dfsw_strtod(const char *nptr, char **endptr, dfsan_label nptr_label, + dfsan_label endptr_label, dfsan_label *ret_label) { + char *tmp_endptr; + double ret = dfsan_strtod(nptr, endptr, &tmp_endptr); + dfsan_strtod_label(nptr, tmp_endptr, ret_label); + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +double __dfso_strtod(const char *nptr, char **endptr, dfsan_label nptr_label, + dfsan_label endptr_label, dfsan_label *ret_label, + dfsan_origin nptr_origin, dfsan_origin endptr_origin, + dfsan_origin *ret_origin) { + char *tmp_endptr; + double ret = dfsan_strtod(nptr, endptr, &tmp_endptr); + dfsan_strtod_label(nptr, tmp_endptr, ret_label); + if (tmp_endptr > nptr) { + // If *tmp_endptr is '\0' include its label as well. + *ret_origin = dfsan_read_origin_of_first_taint( + nptr, tmp_endptr - nptr + (*tmp_endptr ? 0 : 1)); + } else { + *ret_origin = 0; + } + return ret; +} + +static long long int dfsan_strtoll(const char *nptr, char **endptr, int base, + char **tmp_endptr) { + assert(tmp_endptr); + long long int ret = strtoll(nptr, tmp_endptr, base); + if (endptr) + *endptr = *tmp_endptr; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +long long int __dfsw_strtoll(const char *nptr, char **endptr, int base, + dfsan_label nptr_label, dfsan_label endptr_label, + dfsan_label base_label, dfsan_label *ret_label) { + char *tmp_endptr; + long long int ret = dfsan_strtoll(nptr, endptr, base, &tmp_endptr); + dfsan_strtolong_label(nptr, tmp_endptr, base_label, ret_label); + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +long long int __dfso_strtoll(const char *nptr, char **endptr, int base, + dfsan_label nptr_label, dfsan_label endptr_label, + dfsan_label base_label, dfsan_label *ret_label, + dfsan_origin nptr_origin, + dfsan_origin endptr_origin, + dfsan_origin base_origin, + dfsan_origin *ret_origin) { + char *tmp_endptr; + long long int ret = dfsan_strtoll(nptr, endptr, base, &tmp_endptr); + dfsan_strtolong_label(nptr, tmp_endptr, base_label, ret_label); + dfsan_strtolong_origin(nptr, tmp_endptr, base_label, ret_label, base_origin, + ret_origin); + return ret; +} + +static unsigned long int dfsan_strtoul(const char *nptr, char **endptr, + int base, char **tmp_endptr) { + assert(tmp_endptr); + unsigned long int ret = strtoul(nptr, tmp_endptr, base); + if (endptr) + *endptr = *tmp_endptr; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +unsigned long int __dfsw_strtoul(const char *nptr, char **endptr, int base, + dfsan_label nptr_label, dfsan_label endptr_label, + dfsan_label base_label, dfsan_label *ret_label) { + char *tmp_endptr; + unsigned long int ret = dfsan_strtoul(nptr, endptr, base, &tmp_endptr); + dfsan_strtolong_label(nptr, tmp_endptr, base_label, ret_label); + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +unsigned long int __dfso_strtoul( + const char *nptr, char **endptr, int base, dfsan_label nptr_label, + dfsan_label endptr_label, dfsan_label base_label, dfsan_label *ret_label, + dfsan_origin nptr_origin, dfsan_origin endptr_origin, + dfsan_origin base_origin, dfsan_origin *ret_origin) { + char *tmp_endptr; + unsigned long int ret = dfsan_strtoul(nptr, endptr, base, &tmp_endptr); + dfsan_strtolong_label(nptr, tmp_endptr, base_label, ret_label); + dfsan_strtolong_origin(nptr, tmp_endptr, base_label, ret_label, base_origin, + ret_origin); + return ret; +} + +static long long unsigned int dfsan_strtoull(const char *nptr, char **endptr, + int base, char **tmp_endptr) { + assert(tmp_endptr); + long long unsigned int ret = strtoull(nptr, tmp_endptr, base); + if (endptr) + *endptr = *tmp_endptr; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +long long unsigned int __dfsw_strtoull(const char *nptr, char **endptr, + int base, dfsan_label nptr_label, + dfsan_label endptr_label, + dfsan_label base_label, + dfsan_label *ret_label) { + char *tmp_endptr; + long long unsigned int ret = dfsan_strtoull(nptr, endptr, base, &tmp_endptr); + dfsan_strtolong_label(nptr, tmp_endptr, base_label, ret_label); + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +long long unsigned int __dfso_strtoull( + const char *nptr, char **endptr, int base, dfsan_label nptr_label, + dfsan_label endptr_label, dfsan_label base_label, dfsan_label *ret_label, + dfsan_origin nptr_origin, dfsan_origin endptr_origin, + dfsan_origin base_origin, dfsan_origin *ret_origin) { + char *tmp_endptr; + long long unsigned int ret = dfsan_strtoull(nptr, endptr, base, &tmp_endptr); + dfsan_strtolong_label(nptr, tmp_endptr, base_label, ret_label); + dfsan_strtolong_origin(nptr, tmp_endptr, base_label, ret_label, base_origin, + ret_origin); + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +time_t __dfsw_time(time_t *t, dfsan_label t_label, dfsan_label *ret_label) { + time_t ret = time(t); + if (ret != (time_t) -1 && t) { + dfsan_set_label(0, t, sizeof(time_t)); + } + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +time_t __dfso_time(time_t *t, dfsan_label t_label, dfsan_label *ret_label, + dfsan_origin t_origin, dfsan_origin *ret_origin) { + return __dfsw_time(t, t_label, ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfsw_inet_pton(int af, const char *src, void *dst, dfsan_label af_label, + dfsan_label src_label, dfsan_label dst_label, + dfsan_label *ret_label) { + int ret = inet_pton(af, src, dst); + if (ret == 1) { + dfsan_set_label(dfsan_read_label(src, strlen(src) + 1), dst, + af == AF_INET ? sizeof(struct in_addr) : sizeof(in6_addr)); + } + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfso_inet_pton(int af, const char *src, void *dst, dfsan_label af_label, + dfsan_label src_label, dfsan_label dst_label, + dfsan_label *ret_label, dfsan_origin af_origin, + dfsan_origin src_origin, dfsan_origin dst_origin, + dfsan_origin *ret_origin) { + int ret = inet_pton(af, src, dst); + if (ret == 1) { + int src_len = strlen(src) + 1; + dfsan_set_label_origin( + dfsan_read_label(src, src_len), + dfsan_read_origin_of_first_taint(src, src_len), dst, + af == AF_INET ? sizeof(struct in_addr) : sizeof(in6_addr)); + } + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +struct tm *__dfsw_localtime_r(const time_t *timep, struct tm *result, + dfsan_label timep_label, dfsan_label result_label, + dfsan_label *ret_label) { + struct tm *ret = localtime_r(timep, result); + if (ret) { + dfsan_set_label(dfsan_read_label(timep, sizeof(time_t)), result, + sizeof(struct tm)); + *ret_label = result_label; + } else { + *ret_label = 0; + } + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +struct tm *__dfso_localtime_r(const time_t *timep, struct tm *result, + dfsan_label timep_label, dfsan_label result_label, + dfsan_label *ret_label, dfsan_origin timep_origin, + dfsan_origin result_origin, + dfsan_origin *ret_origin) { + struct tm *ret = localtime_r(timep, result); + if (ret) { + dfsan_set_label_origin( + dfsan_read_label(timep, sizeof(time_t)), + dfsan_read_origin_of_first_taint(timep, sizeof(time_t)), result, + sizeof(struct tm)); + *ret_label = result_label; + *ret_origin = result_origin; + } else { + *ret_label = 0; + } + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfsw_getpwuid_r(id_t uid, struct passwd *pwd, + char *buf, size_t buflen, struct passwd **result, + dfsan_label uid_label, dfsan_label pwd_label, + dfsan_label buf_label, dfsan_label buflen_label, + dfsan_label result_label, dfsan_label *ret_label) { + // Store the data in pwd, the strings referenced from pwd in buf, and the + // address of pwd in *result. On failure, NULL is stored in *result. + int ret = getpwuid_r(uid, pwd, buf, buflen, result); + if (ret == 0) { + dfsan_set_label(0, pwd, sizeof(struct passwd)); + dfsan_set_label(0, buf, strlen(buf) + 1); + } + *ret_label = 0; + dfsan_set_label(0, result, sizeof(struct passwd*)); + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfso_getpwuid_r(id_t uid, struct passwd *pwd, char *buf, size_t buflen, + struct passwd **result, dfsan_label uid_label, + dfsan_label pwd_label, dfsan_label buf_label, + dfsan_label buflen_label, dfsan_label result_label, + dfsan_label *ret_label, dfsan_origin uid_origin, + dfsan_origin pwd_origin, dfsan_origin buf_origin, + dfsan_origin buflen_origin, dfsan_origin result_origin, + dfsan_origin *ret_origin) { + return __dfsw_getpwuid_r(uid, pwd, buf, buflen, result, uid_label, pwd_label, + buf_label, buflen_label, result_label, ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfsw_epoll_wait(int epfd, struct epoll_event *events, int maxevents, + int timeout, dfsan_label epfd_label, + dfsan_label events_label, dfsan_label maxevents_label, + dfsan_label timeout_label, dfsan_label *ret_label) { + int ret = epoll_wait(epfd, events, maxevents, timeout); + if (ret > 0) + dfsan_set_label(0, events, ret * sizeof(*events)); + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfso_epoll_wait(int epfd, struct epoll_event *events, int maxevents, + int timeout, dfsan_label epfd_label, + dfsan_label events_label, dfsan_label maxevents_label, + dfsan_label timeout_label, dfsan_label *ret_label, + dfsan_origin epfd_origin, dfsan_origin events_origin, + dfsan_origin maxevents_origin, + dfsan_origin timeout_origin, dfsan_origin *ret_origin) { + return __dfsw_epoll_wait(epfd, events, maxevents, timeout, epfd_label, + events_label, maxevents_label, timeout_label, + ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfsw_poll(struct pollfd *fds, nfds_t nfds, int timeout, + dfsan_label dfs_label, dfsan_label nfds_label, + dfsan_label timeout_label, dfsan_label *ret_label) { + int ret = poll(fds, nfds, timeout); + if (ret >= 0) { + for (; nfds > 0; --nfds) { + dfsan_set_label(0, &fds[nfds - 1].revents, sizeof(fds[nfds - 1].revents)); + } + } + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfso_poll(struct pollfd *fds, nfds_t nfds, int timeout, + dfsan_label dfs_label, dfsan_label nfds_label, + dfsan_label timeout_label, dfsan_label *ret_label, + dfsan_origin dfs_origin, dfsan_origin nfds_origin, + dfsan_origin timeout_origin, dfsan_origin *ret_origin) { + return __dfsw_poll(fds, nfds, timeout, dfs_label, nfds_label, timeout_label, + ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfsw_select(int nfds, fd_set *readfds, fd_set *writefds, + fd_set *exceptfds, struct timeval *timeout, + dfsan_label nfds_label, dfsan_label readfds_label, + dfsan_label writefds_label, dfsan_label exceptfds_label, + dfsan_label timeout_label, dfsan_label *ret_label) { + int ret = select(nfds, readfds, writefds, exceptfds, timeout); + // Clear everything (also on error) since their content is either set or + // undefined. + if (readfds) { + dfsan_set_label(0, readfds, sizeof(fd_set)); + } + if (writefds) { + dfsan_set_label(0, writefds, sizeof(fd_set)); + } + if (exceptfds) { + dfsan_set_label(0, exceptfds, sizeof(fd_set)); + } + dfsan_set_label(0, timeout, sizeof(struct timeval)); + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfso_select(int nfds, fd_set *readfds, fd_set *writefds, + fd_set *exceptfds, struct timeval *timeout, + dfsan_label nfds_label, dfsan_label readfds_label, + dfsan_label writefds_label, dfsan_label exceptfds_label, + dfsan_label timeout_label, dfsan_label *ret_label, + dfsan_origin nfds_origin, dfsan_origin readfds_origin, + dfsan_origin writefds_origin, dfsan_origin exceptfds_origin, + dfsan_origin timeout_origin, dfsan_origin *ret_origin) { + return __dfsw_select(nfds, readfds, writefds, exceptfds, timeout, nfds_label, + readfds_label, writefds_label, exceptfds_label, + timeout_label, ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfsw_sched_getaffinity(pid_t pid, size_t cpusetsize, cpu_set_t *mask, + dfsan_label pid_label, + dfsan_label cpusetsize_label, + dfsan_label mask_label, dfsan_label *ret_label) { + int ret = sched_getaffinity(pid, cpusetsize, mask); + if (ret == 0) { + dfsan_set_label(0, mask, cpusetsize); + } + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfso_sched_getaffinity(pid_t pid, size_t cpusetsize, cpu_set_t *mask, + dfsan_label pid_label, + dfsan_label cpusetsize_label, + dfsan_label mask_label, dfsan_label *ret_label, + dfsan_origin pid_origin, + dfsan_origin cpusetsize_origin, + dfsan_origin mask_origin, + dfsan_origin *ret_origin) { + return __dfsw_sched_getaffinity(pid, cpusetsize, mask, pid_label, + cpusetsize_label, mask_label, ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfsw_sigemptyset(sigset_t *set, dfsan_label set_label, + dfsan_label *ret_label) { + int ret = sigemptyset(set); + dfsan_set_label(0, set, sizeof(sigset_t)); + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfso_sigemptyset(sigset_t *set, dfsan_label set_label, + dfsan_label *ret_label, dfsan_origin set_origin, + dfsan_origin *ret_origin) { + return __dfsw_sigemptyset(set, set_label, ret_label); +} + +class SignalHandlerScope { + public: + SignalHandlerScope() { + if (DFsanThread *t = GetCurrentThread()) + t->EnterSignalHandler(); + } + ~SignalHandlerScope() { + if (DFsanThread *t = GetCurrentThread()) + t->LeaveSignalHandler(); + } +}; + +// Clear DFSan runtime TLS state at the end of a scope. +// +// Implementation must be async-signal-safe and use small data size, because +// instances of this class may live on the signal handler stack. +// +// DFSan uses TLS to pass metadata of arguments and return values. When an +// instrumented function accesses the TLS, if a signal callback happens, and the +// callback calls other instrumented functions with updating the same TLS, the +// TLS is in an inconsistent state after the callback ends. This may cause +// either under-tainting or over-tainting. +// +// The current implementation simply resets TLS at restore. This prevents from +// over-tainting. Although under-tainting may still happen, a taint flow can be +// found eventually if we run a DFSan-instrumented program multiple times. The +// alternative option is saving the entire TLS. However the TLS storage takes +// 2k bytes, and signal calls could be nested. So it does not seem worth. +class ScopedClearThreadLocalState { + public: + ScopedClearThreadLocalState() {} + ~ScopedClearThreadLocalState() { dfsan_clear_thread_local_state(); } +}; + +// SignalSpinLocker::sigactions_mu guarantees atomicity of sigaction() calls. +const int kMaxSignals = 1024; +static atomic_uintptr_t sigactions[kMaxSignals]; + +static void SignalHandler(int signo) { + SignalHandlerScope signal_handler_scope; + ScopedClearThreadLocalState scoped_clear_tls; + + // Clear shadows for all inputs provided by system. This is why DFSan + // instrumentation generates a trampoline function to each function pointer, + // and uses the trampoline to clear shadows. However sigaction does not use + // a function pointer directly, so we have to do this manually. + dfsan_clear_arg_tls(0, sizeof(dfsan_label)); + + typedef void (*signal_cb)(int x); + signal_cb cb = + (signal_cb)atomic_load(&sigactions[signo], memory_order_relaxed); + cb(signo); +} + +static void SignalAction(int signo, siginfo_t *si, void *uc) { + SignalHandlerScope signal_handler_scope; + ScopedClearThreadLocalState scoped_clear_tls; + + // Clear shadows for all inputs provided by system. Similar to SignalHandler. + dfsan_clear_arg_tls(0, 3 * sizeof(dfsan_label)); + dfsan_set_label(0, si, sizeof(*si)); + dfsan_set_label(0, uc, sizeof(ucontext_t)); + + typedef void (*sigaction_cb)(int, siginfo_t *, void *); + sigaction_cb cb = + (sigaction_cb)atomic_load(&sigactions[signo], memory_order_relaxed); + cb(signo, si, uc); +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfsw_sigaction(int signum, const struct sigaction *act, + struct sigaction *oldact, dfsan_label signum_label, + dfsan_label act_label, dfsan_label oldact_label, + dfsan_label *ret_label) { + CHECK_LT(signum, kMaxSignals); + SignalSpinLocker lock; + uptr old_cb = atomic_load(&sigactions[signum], memory_order_relaxed); + struct sigaction new_act; + struct sigaction *pnew_act = act ? &new_act : nullptr; + if (act) { + internal_memcpy(pnew_act, act, sizeof(struct sigaction)); + if (pnew_act->sa_flags & SA_SIGINFO) { + uptr cb = (uptr)(pnew_act->sa_sigaction); + if (cb != (uptr)SIG_IGN && cb != (uptr)SIG_DFL) { + atomic_store(&sigactions[signum], cb, memory_order_relaxed); + pnew_act->sa_sigaction = SignalAction; + } + } else { + uptr cb = (uptr)(pnew_act->sa_handler); + if (cb != (uptr)SIG_IGN && cb != (uptr)SIG_DFL) { + atomic_store(&sigactions[signum], cb, memory_order_relaxed); + pnew_act->sa_handler = SignalHandler; + } + } + } + + int ret = sigaction(signum, pnew_act, oldact); + + if (ret == 0 && oldact) { + if (oldact->sa_flags & SA_SIGINFO) { + if (oldact->sa_sigaction == SignalAction) + oldact->sa_sigaction = (decltype(oldact->sa_sigaction))old_cb; + } else { + if (oldact->sa_handler == SignalHandler) + oldact->sa_handler = (decltype(oldact->sa_handler))old_cb; + } + } + + if (oldact) { + dfsan_set_label(0, oldact, sizeof(struct sigaction)); + } + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfso_sigaction(int signum, const struct sigaction *act, + struct sigaction *oldact, dfsan_label signum_label, + dfsan_label act_label, dfsan_label oldact_label, + dfsan_label *ret_label, dfsan_origin signum_origin, + dfsan_origin act_origin, dfsan_origin oldact_origin, + dfsan_origin *ret_origin) { + return __dfsw_sigaction(signum, act, oldact, signum_label, act_label, + oldact_label, ret_label); +} + +static sighandler_t dfsan_signal(int signum, sighandler_t handler, + dfsan_label *ret_label) { + CHECK_LT(signum, kMaxSignals); + SignalSpinLocker lock; + uptr old_cb = atomic_load(&sigactions[signum], memory_order_relaxed); + if (handler != SIG_IGN && handler != SIG_DFL) { + atomic_store(&sigactions[signum], (uptr)handler, memory_order_relaxed); + handler = &SignalHandler; + } + + sighandler_t ret = signal(signum, handler); + + if (ret == SignalHandler) + ret = (sighandler_t)old_cb; + + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +sighandler_t __dfsw_signal(int signum, + void *(*handler_trampoline)(void *, int, dfsan_label, + dfsan_label *), + sighandler_t handler, dfsan_label signum_label, + dfsan_label handler_label, dfsan_label *ret_label) { + return dfsan_signal(signum, handler, ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE +sighandler_t __dfso_signal( + int signum, + void *(*handler_trampoline)(void *, int, dfsan_label, dfsan_label *, + dfsan_origin, dfsan_origin *), + sighandler_t handler, dfsan_label signum_label, dfsan_label handler_label, + dfsan_label *ret_label, dfsan_origin signum_origin, + dfsan_origin handler_origin, dfsan_origin *ret_origin) { + return dfsan_signal(signum, handler, ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfsw_sigaltstack(const stack_t *ss, stack_t *old_ss, dfsan_label ss_label, + dfsan_label old_ss_label, dfsan_label *ret_label) { + int ret = sigaltstack(ss, old_ss); + if (ret != -1 && old_ss) + dfsan_set_label(0, old_ss, sizeof(*old_ss)); + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfso_sigaltstack(const stack_t *ss, stack_t *old_ss, dfsan_label ss_label, + dfsan_label old_ss_label, dfsan_label *ret_label, + dfsan_origin ss_origin, dfsan_origin old_ss_origin, + dfsan_origin *ret_origin) { + return __dfsw_sigaltstack(ss, old_ss, ss_label, old_ss_label, ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfsw_gettimeofday(struct timeval *tv, struct timezone *tz, + dfsan_label tv_label, dfsan_label tz_label, + dfsan_label *ret_label) { + int ret = gettimeofday(tv, tz); + if (tv) { + dfsan_set_label(0, tv, sizeof(struct timeval)); + } + if (tz) { + dfsan_set_label(0, tz, sizeof(struct timezone)); + } + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfso_gettimeofday(struct timeval *tv, struct timezone *tz, + dfsan_label tv_label, dfsan_label tz_label, + dfsan_label *ret_label, dfsan_origin tv_origin, + dfsan_origin tz_origin, dfsan_origin *ret_origin) { + return __dfsw_gettimeofday(tv, tz, tv_label, tz_label, ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE void *__dfsw_memchr(void *s, int c, size_t n, + dfsan_label s_label, + dfsan_label c_label, + dfsan_label n_label, + dfsan_label *ret_label) { + void *ret = memchr(s, c, n); + if (flags().strict_data_dependencies) { + *ret_label = ret ? s_label : 0; + } else { + size_t len = + ret ? reinterpret_cast<char *>(ret) - reinterpret_cast<char *>(s) + 1 + : n; + *ret_label = + dfsan_union(dfsan_read_label(s, len), dfsan_union(s_label, c_label)); + } + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE void *__dfso_memchr( + void *s, int c, size_t n, dfsan_label s_label, dfsan_label c_label, + dfsan_label n_label, dfsan_label *ret_label, dfsan_origin s_origin, + dfsan_origin c_origin, dfsan_origin n_origin, dfsan_origin *ret_origin) { + void *ret = __dfsw_memchr(s, c, n, s_label, c_label, n_label, ret_label); + if (flags().strict_data_dependencies) { + if (ret) + *ret_origin = s_origin; + } else { + size_t len = + ret ? reinterpret_cast<char *>(ret) - reinterpret_cast<char *>(s) + 1 + : n; + dfsan_origin o = dfsan_read_origin_of_first_taint(s, len); + *ret_origin = o ? o : (s_label ? s_origin : c_origin); + } + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE char *__dfsw_strrchr(char *s, int c, + dfsan_label s_label, + dfsan_label c_label, + dfsan_label *ret_label) { + char *ret = strrchr(s, c); + if (flags().strict_data_dependencies) { + *ret_label = ret ? s_label : 0; + } else { + *ret_label = + dfsan_union(dfsan_read_label(s, strlen(s) + 1), + dfsan_union(s_label, c_label)); + } + + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE char *__dfso_strrchr( + char *s, int c, dfsan_label s_label, dfsan_label c_label, + dfsan_label *ret_label, dfsan_origin s_origin, dfsan_origin c_origin, + dfsan_origin *ret_origin) { + char *ret = __dfsw_strrchr(s, c, s_label, c_label, ret_label); + if (flags().strict_data_dependencies) { + if (ret) + *ret_origin = s_origin; + } else { + size_t s_len = strlen(s) + 1; + dfsan_origin o = dfsan_read_origin_of_first_taint(s, s_len); + *ret_origin = o ? o : (s_label ? s_origin : c_origin); + } + + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE char *__dfsw_strstr(char *haystack, char *needle, + dfsan_label haystack_label, + dfsan_label needle_label, + dfsan_label *ret_label) { + char *ret = strstr(haystack, needle); + if (flags().strict_data_dependencies) { + *ret_label = ret ? haystack_label : 0; + } else { + size_t len = ret ? ret + strlen(needle) - haystack : strlen(haystack) + 1; + *ret_label = + dfsan_union(dfsan_read_label(haystack, len), + dfsan_union(dfsan_read_label(needle, strlen(needle) + 1), + dfsan_union(haystack_label, needle_label))); + } + + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE char *__dfso_strstr(char *haystack, char *needle, + dfsan_label haystack_label, + dfsan_label needle_label, + dfsan_label *ret_label, + dfsan_origin haystack_origin, + dfsan_origin needle_origin, + dfsan_origin *ret_origin) { + char *ret = + __dfsw_strstr(haystack, needle, haystack_label, needle_label, ret_label); + if (flags().strict_data_dependencies) { + if (ret) + *ret_origin = haystack_origin; + } else { + size_t needle_len = strlen(needle); + size_t len = ret ? ret + needle_len - haystack : strlen(haystack) + 1; + dfsan_origin o = dfsan_read_origin_of_first_taint(haystack, len); + if (o) { + *ret_origin = o; + } else { + o = dfsan_read_origin_of_first_taint(needle, needle_len + 1); + *ret_origin = o ? o : (haystack_label ? haystack_origin : needle_origin); + } + } + + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfsw_nanosleep(const struct timespec *req, + struct timespec *rem, + dfsan_label req_label, + dfsan_label rem_label, + dfsan_label *ret_label) { + int ret = nanosleep(req, rem); + *ret_label = 0; + if (ret == -1) { + // Interrupted by a signal, rem is filled with the remaining time. + dfsan_set_label(0, rem, sizeof(struct timespec)); + } + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfso_nanosleep( + const struct timespec *req, struct timespec *rem, dfsan_label req_label, + dfsan_label rem_label, dfsan_label *ret_label, dfsan_origin req_origin, + dfsan_origin rem_origin, dfsan_origin *ret_origin) { + return __dfsw_nanosleep(req, rem, req_label, rem_label, ret_label); +} + +static void clear_msghdr_labels(size_t bytes_written, struct msghdr *msg) { + dfsan_set_label(0, msg, sizeof(*msg)); + dfsan_set_label(0, msg->msg_name, msg->msg_namelen); + dfsan_set_label(0, msg->msg_control, msg->msg_controllen); + for (size_t i = 0; bytes_written > 0; ++i) { + assert(i < msg->msg_iovlen); + struct iovec *iov = &msg->msg_iov[i]; + size_t iov_written = + bytes_written < iov->iov_len ? bytes_written : iov->iov_len; + dfsan_set_label(0, iov->iov_base, iov_written); + bytes_written -= iov_written; + } +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfsw_recvmmsg( + int sockfd, struct mmsghdr *msgvec, unsigned int vlen, int flags, + struct timespec *timeout, dfsan_label sockfd_label, + dfsan_label msgvec_label, dfsan_label vlen_label, dfsan_label flags_label, + dfsan_label timeout_label, dfsan_label *ret_label) { + int ret = recvmmsg(sockfd, msgvec, vlen, flags, timeout); + for (int i = 0; i < ret; ++i) { + dfsan_set_label(0, &msgvec[i].msg_len, sizeof(msgvec[i].msg_len)); + clear_msghdr_labels(msgvec[i].msg_len, &msgvec[i].msg_hdr); + } + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfso_recvmmsg( + int sockfd, struct mmsghdr *msgvec, unsigned int vlen, int flags, + struct timespec *timeout, dfsan_label sockfd_label, + dfsan_label msgvec_label, dfsan_label vlen_label, dfsan_label flags_label, + dfsan_label timeout_label, dfsan_label *ret_label, + dfsan_origin sockfd_origin, dfsan_origin msgvec_origin, + dfsan_origin vlen_origin, dfsan_origin flags_origin, + dfsan_origin timeout_origin, dfsan_origin *ret_origin) { + return __dfsw_recvmmsg(sockfd, msgvec, vlen, flags, timeout, sockfd_label, + msgvec_label, vlen_label, flags_label, timeout_label, + ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE ssize_t __dfsw_recvmsg( + int sockfd, struct msghdr *msg, int flags, dfsan_label sockfd_label, + dfsan_label msg_label, dfsan_label flags_label, dfsan_label *ret_label) { + ssize_t ret = recvmsg(sockfd, msg, flags); + if (ret >= 0) + clear_msghdr_labels(ret, msg); + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE ssize_t __dfso_recvmsg( + int sockfd, struct msghdr *msg, int flags, dfsan_label sockfd_label, + dfsan_label msg_label, dfsan_label flags_label, dfsan_label *ret_label, + dfsan_origin sockfd_origin, dfsan_origin msg_origin, + dfsan_origin flags_origin, dfsan_origin *ret_origin) { + return __dfsw_recvmsg(sockfd, msg, flags, sockfd_label, msg_label, + flags_label, ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE int +__dfsw_socketpair(int domain, int type, int protocol, int sv[2], + dfsan_label domain_label, dfsan_label type_label, + dfsan_label protocol_label, dfsan_label sv_label, + dfsan_label *ret_label) { + int ret = socketpair(domain, type, protocol, sv); + *ret_label = 0; + if (ret == 0) { + dfsan_set_label(0, sv, sizeof(*sv) * 2); + } + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfso_socketpair( + int domain, int type, int protocol, int sv[2], dfsan_label domain_label, + dfsan_label type_label, dfsan_label protocol_label, dfsan_label sv_label, + dfsan_label *ret_label, dfsan_origin domain_origin, + dfsan_origin type_origin, dfsan_origin protocol_origin, + dfsan_origin sv_origin, dfsan_origin *ret_origin) { + return __dfsw_socketpair(domain, type, protocol, sv, domain_label, type_label, + protocol_label, sv_label, ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfsw_getsockopt( + int sockfd, int level, int optname, void *optval, socklen_t *optlen, + dfsan_label sockfd_label, dfsan_label level_label, + dfsan_label optname_label, dfsan_label optval_label, + dfsan_label optlen_label, dfsan_label *ret_label) { + int ret = getsockopt(sockfd, level, optname, optval, optlen); + if (ret != -1 && optval && optlen) { + dfsan_set_label(0, optlen, sizeof(*optlen)); + dfsan_set_label(0, optval, *optlen); + } + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfso_getsockopt( + int sockfd, int level, int optname, void *optval, socklen_t *optlen, + dfsan_label sockfd_label, dfsan_label level_label, + dfsan_label optname_label, dfsan_label optval_label, + dfsan_label optlen_label, dfsan_label *ret_label, + dfsan_origin sockfd_origin, dfsan_origin level_origin, + dfsan_origin optname_origin, dfsan_origin optval_origin, + dfsan_origin optlen_origin, dfsan_origin *ret_origin) { + return __dfsw_getsockopt(sockfd, level, optname, optval, optlen, sockfd_label, + level_label, optname_label, optval_label, + optlen_label, ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfsw_getsockname( + int sockfd, struct sockaddr *addr, socklen_t *addrlen, + dfsan_label sockfd_label, dfsan_label addr_label, dfsan_label addrlen_label, + dfsan_label *ret_label) { + socklen_t origlen = addrlen ? *addrlen : 0; + int ret = getsockname(sockfd, addr, addrlen); + if (ret != -1 && addr && addrlen) { + socklen_t written_bytes = origlen < *addrlen ? origlen : *addrlen; + dfsan_set_label(0, addrlen, sizeof(*addrlen)); + dfsan_set_label(0, addr, written_bytes); + } + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfso_getsockname( + int sockfd, struct sockaddr *addr, socklen_t *addrlen, + dfsan_label sockfd_label, dfsan_label addr_label, dfsan_label addrlen_label, + dfsan_label *ret_label, dfsan_origin sockfd_origin, + dfsan_origin addr_origin, dfsan_origin addrlen_origin, + dfsan_origin *ret_origin) { + return __dfsw_getsockname(sockfd, addr, addrlen, sockfd_label, addr_label, + addrlen_label, ret_label); +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfsw_getpeername( + int sockfd, struct sockaddr *addr, socklen_t *addrlen, + dfsan_label sockfd_label, dfsan_label addr_label, dfsan_label addrlen_label, + dfsan_label *ret_label) { + socklen_t origlen = addrlen ? *addrlen : 0; + int ret = getpeername(sockfd, addr, addrlen); + if (ret != -1 && addr && addrlen) { + socklen_t written_bytes = origlen < *addrlen ? origlen : *addrlen; + dfsan_set_label(0, addrlen, sizeof(*addrlen)); + dfsan_set_label(0, addr, written_bytes); + } + *ret_label = 0; + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfso_getpeername( + int sockfd, struct sockaddr *addr, socklen_t *addrlen, + dfsan_label sockfd_label, dfsan_label addr_label, dfsan_label addrlen_label, + dfsan_label *ret_label, dfsan_origin sockfd_origin, + dfsan_origin addr_origin, dfsan_origin addrlen_origin, + dfsan_origin *ret_origin) { + return __dfsw_getpeername(sockfd, addr, addrlen, sockfd_label, addr_label, + addrlen_label, ret_label); +} + +// Type of the trampoline function passed to the custom version of +// dfsan_set_write_callback. +typedef void (*write_trampoline_t)( + void *callback, + int fd, const void *buf, ssize_t count, + dfsan_label fd_label, dfsan_label buf_label, dfsan_label count_label); + +typedef void (*write_origin_trampoline_t)( + void *callback, int fd, const void *buf, ssize_t count, + dfsan_label fd_label, dfsan_label buf_label, dfsan_label count_label, + dfsan_origin fd_origin, dfsan_origin buf_origin, dfsan_origin count_origin); + +// Calls to dfsan_set_write_callback() set the values in this struct. +// Calls to the custom version of write() read (and invoke) them. +static struct { + write_trampoline_t write_callback_trampoline = nullptr; + void *write_callback = nullptr; +} write_callback_info; + +static struct { + write_origin_trampoline_t write_callback_trampoline = nullptr; + void *write_callback = nullptr; +} write_origin_callback_info; + +SANITIZER_INTERFACE_ATTRIBUTE void +__dfsw_dfsan_set_write_callback( + write_trampoline_t write_callback_trampoline, + void *write_callback, + dfsan_label write_callback_label, + dfsan_label *ret_label) { + write_callback_info.write_callback_trampoline = write_callback_trampoline; + write_callback_info.write_callback = write_callback; +} + +SANITIZER_INTERFACE_ATTRIBUTE void __dfso_dfsan_set_write_callback( + write_origin_trampoline_t write_callback_trampoline, void *write_callback, + dfsan_label write_callback_label, dfsan_label *ret_label, + dfsan_origin write_callback_origin, dfsan_origin *ret_origin) { + write_origin_callback_info.write_callback_trampoline = + write_callback_trampoline; + write_origin_callback_info.write_callback = write_callback; +} + +SANITIZER_INTERFACE_ATTRIBUTE int +__dfsw_write(int fd, const void *buf, size_t count, + dfsan_label fd_label, dfsan_label buf_label, + dfsan_label count_label, dfsan_label *ret_label) { + if (write_callback_info.write_callback) { + write_callback_info.write_callback_trampoline( + write_callback_info.write_callback, + fd, buf, count, + fd_label, buf_label, count_label); + } + + *ret_label = 0; + return write(fd, buf, count); +} + +SANITIZER_INTERFACE_ATTRIBUTE int __dfso_write( + int fd, const void *buf, size_t count, dfsan_label fd_label, + dfsan_label buf_label, dfsan_label count_label, dfsan_label *ret_label, + dfsan_origin fd_origin, dfsan_origin buf_origin, dfsan_origin count_origin, + dfsan_origin *ret_origin) { + if (write_origin_callback_info.write_callback) { + write_origin_callback_info.write_callback_trampoline( + write_origin_callback_info.write_callback, fd, buf, count, fd_label, + buf_label, count_label, fd_origin, buf_origin, count_origin); + } + + *ret_label = 0; + return write(fd, buf, count); +} +} // namespace __dfsan + +// Type used to extract a dfsan_label with va_arg() +typedef int dfsan_label_va; + +// Formats a chunk either a constant string or a single format directive (e.g., +// '%.3f'). +struct Formatter { + Formatter(char *str_, const char *fmt_, size_t size_) + : str(str_), str_off(0), size(size_), fmt_start(fmt_), fmt_cur(fmt_), + width(-1) {} + + int format() { + char *tmp_fmt = build_format_string(); + int retval = + snprintf(str + str_off, str_off < size ? size - str_off : 0, tmp_fmt, + 0 /* used only to avoid warnings */); + free(tmp_fmt); + return retval; + } + + template <typename T> int format(T arg) { + char *tmp_fmt = build_format_string(); + int retval; + if (width >= 0) { + retval = snprintf(str + str_off, str_off < size ? size - str_off : 0, + tmp_fmt, width, arg); + } else { + retval = snprintf(str + str_off, str_off < size ? size - str_off : 0, + tmp_fmt, arg); + } + free(tmp_fmt); + return retval; + } + + char *build_format_string() { + size_t fmt_size = fmt_cur - fmt_start + 1; + char *new_fmt = (char *)malloc(fmt_size + 1); + assert(new_fmt); + internal_memcpy(new_fmt, fmt_start, fmt_size); + new_fmt[fmt_size] = '\0'; + return new_fmt; + } + + char *str_cur() { return str + str_off; } + + size_t num_written_bytes(int retval) { + if (retval < 0) { + return 0; + } + + size_t num_avail = str_off < size ? size - str_off : 0; + if (num_avail == 0) { + return 0; + } + + size_t num_written = retval; + // A return value of {v,}snprintf of size or more means that the output was + // truncated. + if (num_written >= num_avail) { + num_written -= num_avail; + } + + return num_written; + } + + char *str; + size_t str_off; + size_t size; + const char *fmt_start; + const char *fmt_cur; + int width; +}; + +// Formats the input and propagates the input labels to the output. The output +// is stored in 'str'. 'size' bounds the number of output bytes. 'format' and +// 'ap' are the format string and the list of arguments for formatting. Returns +// the return value vsnprintf would return. +// +// The function tokenizes the format string in chunks representing either a +// constant string or a single format directive (e.g., '%.3f') and formats each +// chunk independently into the output string. This approach allows to figure +// out which bytes of the output string depends on which argument and thus to +// propagate labels more precisely. +// +// WARNING: This implementation does not support conversion specifiers with +// positional arguments. +static int format_buffer(char *str, size_t size, const char *fmt, + dfsan_label *va_labels, dfsan_label *ret_label, + dfsan_origin *va_origins, dfsan_origin *ret_origin, + va_list ap) { + Formatter formatter(str, fmt, size); + + while (*formatter.fmt_cur) { + formatter.fmt_start = formatter.fmt_cur; + formatter.width = -1; + int retval = 0; + + if (*formatter.fmt_cur != '%') { + // Ordinary character. Consume all the characters until a '%' or the end + // of the string. + for (; *(formatter.fmt_cur + 1) && *(formatter.fmt_cur + 1) != '%'; + ++formatter.fmt_cur) {} + retval = formatter.format(); + dfsan_set_label(0, formatter.str_cur(), + formatter.num_written_bytes(retval)); + } else { + // Conversion directive. Consume all the characters until a conversion + // specifier or the end of the string. + bool end_fmt = false; + for (; *formatter.fmt_cur && !end_fmt; ) { + switch (*++formatter.fmt_cur) { + case 'd': + case 'i': + case 'o': + case 'u': + case 'x': + case 'X': + switch (*(formatter.fmt_cur - 1)) { + case 'h': + // Also covers the 'hh' case (since the size of the arg is still + // an int). + retval = formatter.format(va_arg(ap, int)); + break; + case 'l': + if (formatter.fmt_cur - formatter.fmt_start >= 2 && + *(formatter.fmt_cur - 2) == 'l') { + retval = formatter.format(va_arg(ap, long long int)); + } else { + retval = formatter.format(va_arg(ap, long int)); + } + break; + case 'q': + retval = formatter.format(va_arg(ap, long long int)); + break; + case 'j': + retval = formatter.format(va_arg(ap, intmax_t)); + break; + case 'z': + case 't': + retval = formatter.format(va_arg(ap, size_t)); + break; + default: + retval = formatter.format(va_arg(ap, int)); + } + if (va_origins == nullptr) + dfsan_set_label(*va_labels++, formatter.str_cur(), + formatter.num_written_bytes(retval)); + else + dfsan_set_label_origin(*va_labels++, *va_origins++, + formatter.str_cur(), + formatter.num_written_bytes(retval)); + end_fmt = true; + break; + + case 'a': + case 'A': + case 'e': + case 'E': + case 'f': + case 'F': + case 'g': + case 'G': + if (*(formatter.fmt_cur - 1) == 'L') { + retval = formatter.format(va_arg(ap, long double)); + } else { + retval = formatter.format(va_arg(ap, double)); + } + if (va_origins == nullptr) + dfsan_set_label(*va_labels++, formatter.str_cur(), + formatter.num_written_bytes(retval)); + else + dfsan_set_label_origin(*va_labels++, *va_origins++, + formatter.str_cur(), + formatter.num_written_bytes(retval)); + end_fmt = true; + break; + + case 'c': + retval = formatter.format(va_arg(ap, int)); + if (va_origins == nullptr) + dfsan_set_label(*va_labels++, formatter.str_cur(), + formatter.num_written_bytes(retval)); + else + dfsan_set_label_origin(*va_labels++, *va_origins++, + formatter.str_cur(), + formatter.num_written_bytes(retval)); + end_fmt = true; + break; + + case 's': { + char *arg = va_arg(ap, char *); + retval = formatter.format(arg); + if (va_origins) { + va_origins++; + dfsan_mem_origin_transfer(formatter.str_cur(), arg, + formatter.num_written_bytes(retval)); + } + va_labels++; + dfsan_mem_shadow_transfer(formatter.str_cur(), arg, + formatter.num_written_bytes(retval)); + end_fmt = true; + break; + } + + case 'p': + retval = formatter.format(va_arg(ap, void *)); + if (va_origins == nullptr) + dfsan_set_label(*va_labels++, formatter.str_cur(), + formatter.num_written_bytes(retval)); + else + dfsan_set_label_origin(*va_labels++, *va_origins++, + formatter.str_cur(), + formatter.num_written_bytes(retval)); + end_fmt = true; + break; + + case 'n': { + int *ptr = va_arg(ap, int *); + *ptr = (int)formatter.str_off; + va_labels++; + if (va_origins) + va_origins++; + dfsan_set_label(0, ptr, sizeof(ptr)); + end_fmt = true; + break; + } + + case '%': + retval = formatter.format(); + dfsan_set_label(0, formatter.str_cur(), + formatter.num_written_bytes(retval)); + end_fmt = true; + break; + + case '*': + formatter.width = va_arg(ap, int); + va_labels++; + if (va_origins) + va_origins++; + break; + + default: + break; + } + } + } + + if (retval < 0) { + return retval; + } + + formatter.fmt_cur++; + formatter.str_off += retval; + } + + *ret_label = 0; + if (ret_origin) + *ret_origin = 0; + + // Number of bytes written in total. + return formatter.str_off; +} + +extern "C" { +SANITIZER_INTERFACE_ATTRIBUTE +int __dfsw_sprintf(char *str, const char *format, dfsan_label str_label, + dfsan_label format_label, dfsan_label *va_labels, + dfsan_label *ret_label, ...) { + va_list ap; + va_start(ap, ret_label); + int ret = format_buffer(str, ~0ul, format, va_labels, ret_label, nullptr, + nullptr, ap); + va_end(ap); + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfso_sprintf(char *str, const char *format, dfsan_label str_label, + dfsan_label format_label, dfsan_label *va_labels, + dfsan_label *ret_label, dfsan_origin str_origin, + dfsan_origin format_origin, dfsan_origin *va_origins, + dfsan_origin *ret_origin, ...) { + va_list ap; + va_start(ap, ret_origin); + int ret = format_buffer(str, ~0ul, format, va_labels, ret_label, va_origins, + ret_origin, ap); + va_end(ap); + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfsw_snprintf(char *str, size_t size, const char *format, + dfsan_label str_label, dfsan_label size_label, + dfsan_label format_label, dfsan_label *va_labels, + dfsan_label *ret_label, ...) { + va_list ap; + va_start(ap, ret_label); + int ret = format_buffer(str, size, format, va_labels, ret_label, nullptr, + nullptr, ap); + va_end(ap); + return ret; +} + +SANITIZER_INTERFACE_ATTRIBUTE +int __dfso_snprintf(char *str, size_t size, const char *format, + dfsan_label str_label, dfsan_label size_label, + dfsan_label format_label, dfsan_label *va_labels, + dfsan_label *ret_label, dfsan_origin str_origin, + dfsan_origin size_origin, dfsan_origin format_origin, + dfsan_origin *va_origins, dfsan_origin *ret_origin, ...) { + va_list ap; + va_start(ap, ret_origin); + int ret = format_buffer(str, size, format, va_labels, ret_label, va_origins, + ret_origin, ap); + va_end(ap); + return ret; +} + +static void BeforeFork() { + StackDepotLockAll(); + GetChainedOriginDepot()->LockAll(); +} + +static void AfterFork() { + GetChainedOriginDepot()->UnlockAll(); + StackDepotUnlockAll(); +} + +SANITIZER_INTERFACE_ATTRIBUTE +pid_t __dfsw_fork(dfsan_label *ret_label) { + pid_t pid = fork(); + *ret_label = 0; + return pid; +} + +SANITIZER_INTERFACE_ATTRIBUTE +pid_t __dfso_fork(dfsan_label *ret_label, dfsan_origin *ret_origin) { + BeforeFork(); + pid_t pid = __dfsw_fork(ret_label); + AfterFork(); + return pid; +} + +// Default empty implementations (weak). Users should redefine them. +SANITIZER_INTERFACE_WEAK_DEF(void, __sanitizer_cov_trace_pc_guard, u32 *) {} +SANITIZER_INTERFACE_WEAK_DEF(void, __sanitizer_cov_trace_pc_guard_init, u32 *, + u32 *) {} +SANITIZER_INTERFACE_WEAK_DEF(void, __sanitizer_cov_pcs_init, const uptr *beg, + const uptr *end) {} +SANITIZER_INTERFACE_WEAK_DEF(void, __sanitizer_cov_trace_pc_indir, void) {} + +SANITIZER_INTERFACE_WEAK_DEF(void, __dfsw___sanitizer_cov_trace_cmp, void) {} +SANITIZER_INTERFACE_WEAK_DEF(void, __dfsw___sanitizer_cov_trace_cmp1, void) {} +SANITIZER_INTERFACE_WEAK_DEF(void, __dfsw___sanitizer_cov_trace_cmp2, void) {} +SANITIZER_INTERFACE_WEAK_DEF(void, __dfsw___sanitizer_cov_trace_cmp4, void) {} +SANITIZER_INTERFACE_WEAK_DEF(void, __dfsw___sanitizer_cov_trace_cmp8, void) {} +SANITIZER_INTERFACE_WEAK_DEF(void, __dfsw___sanitizer_cov_trace_const_cmp1, + void) {} +SANITIZER_INTERFACE_WEAK_DEF(void, __dfsw___sanitizer_cov_trace_const_cmp2, + void) {} +SANITIZER_INTERFACE_WEAK_DEF(void, __dfsw___sanitizer_cov_trace_const_cmp4, + void) {} +SANITIZER_INTERFACE_WEAK_DEF(void, __dfsw___sanitizer_cov_trace_const_cmp8, + void) {} +SANITIZER_INTERFACE_WEAK_DEF(void, __dfsw___sanitizer_cov_trace_switch, void) {} +} // extern "C" diff --git a/contrib/libs/clang14-rt/lib/dfsan/dfsan_flags.h b/contrib/libs/clang14-rt/lib/dfsan/dfsan_flags.h new file mode 100644 index 0000000000..ec7edf6112 --- /dev/null +++ b/contrib/libs/clang14-rt/lib/dfsan/dfsan_flags.h @@ -0,0 +1,32 @@ +//===-- dfsan_flags.h -------------------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file is a part of DataFlowSanitizer. +// +// DFSan flags. +//===----------------------------------------------------------------------===// + +#ifndef DFSAN_FLAGS_H +#define DFSAN_FLAGS_H + +namespace __dfsan { + +struct Flags { +#define DFSAN_FLAG(Type, Name, DefaultValue, Description) Type Name; +#include "dfsan_flags.inc" +#undef DFSAN_FLAG + + void SetDefaults(); +}; + +extern Flags flags_data; +inline Flags &flags() { return flags_data; } + +} // namespace __dfsan + +#endif // DFSAN_FLAGS_H diff --git a/contrib/libs/clang14-rt/lib/dfsan/dfsan_flags.inc b/contrib/libs/clang14-rt/lib/dfsan/dfsan_flags.inc new file mode 100644 index 0000000000..67fda0eee4 --- /dev/null +++ b/contrib/libs/clang14-rt/lib/dfsan/dfsan_flags.inc @@ -0,0 +1,43 @@ +//===-- dfsan_flags.inc -----------------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// DFSan runtime flags. +// +//===----------------------------------------------------------------------===// +#ifndef DFSAN_FLAG +# error "Define DFSAN_FLAG prior to including this file!" +#endif + +// DFSAN_FLAG(Type, Name, DefaultValue, Description) +// See COMMON_FLAG in sanitizer_flags.inc for more details. + +DFSAN_FLAG(bool, warn_unimplemented, false, + "Whether to warn on unimplemented functions.") +DFSAN_FLAG(bool, warn_nonzero_labels, false, + "Whether to warn on unimplemented functions.") +DFSAN_FLAG( + bool, strict_data_dependencies, true, + "Whether to propagate labels only when there is an obvious data dependency" + "(e.g., when comparing strings, ignore the fact that the output of the" + "comparison might be data-dependent on the content of the strings). This" + "applies only to the custom functions defined in 'custom.c'.") +DFSAN_FLAG( + int, origin_history_size, Origin::kMaxDepth, + "The limit of origin chain length. Non-positive values mean unlimited.") +DFSAN_FLAG( + int, origin_history_per_stack_limit, 20000, + "The limit of origin node's references count. " + "Non-positive values mean unlimited.") +DFSAN_FLAG(int, store_context_size, 20, + "The depth limit of origin tracking stack traces.") +DFSAN_FLAG(bool, check_origin_invariant, false, + "Whether to check if the origin invariant holds.") +DFSAN_FLAG(bool, zero_in_malloc, true, + "Whether to zero shadow space of new allocated memory.") +DFSAN_FLAG(bool, zero_in_free, true, + "Whether to zero shadow space of deallocated memory.") diff --git a/contrib/libs/clang14-rt/lib/dfsan/dfsan_interceptors.cpp b/contrib/libs/clang14-rt/lib/dfsan/dfsan_interceptors.cpp new file mode 100644 index 0000000000..d8fb9ea866 --- /dev/null +++ b/contrib/libs/clang14-rt/lib/dfsan/dfsan_interceptors.cpp @@ -0,0 +1,220 @@ +//===-- dfsan_interceptors.cpp --------------------------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file is a part of DataFlowSanitizer. +// +// Interceptors for standard library functions. +//===----------------------------------------------------------------------===// + +#include <sys/syscall.h> +#include <unistd.h> + +#include "dfsan/dfsan.h" +#include "dfsan/dfsan_thread.h" +#include "interception/interception.h" +#include "sanitizer_common/sanitizer_allocator_dlsym.h" +#include "sanitizer_common/sanitizer_allocator_interface.h" +#include "sanitizer_common/sanitizer_common.h" +#include "sanitizer_common/sanitizer_errno.h" +#include "sanitizer_common/sanitizer_platform_limits_posix.h" +#include "sanitizer_common/sanitizer_posix.h" +#include "sanitizer_common/sanitizer_tls_get_addr.h" + +using namespace __sanitizer; + +static bool interceptors_initialized; + +struct DlsymAlloc : public DlSymAllocator<DlsymAlloc> { + static bool UseImpl() { return !__dfsan::dfsan_inited; } +}; + +INTERCEPTOR(void *, reallocarray, void *ptr, SIZE_T nmemb, SIZE_T size) { + return __dfsan::dfsan_reallocarray(ptr, nmemb, size); +} + +INTERCEPTOR(void *, __libc_memalign, SIZE_T alignment, SIZE_T size) { + void *ptr = __dfsan::dfsan_memalign(alignment, size); + if (ptr) + DTLS_on_libc_memalign(ptr, size); + return ptr; +} + +INTERCEPTOR(void *, aligned_alloc, SIZE_T alignment, SIZE_T size) { + return __dfsan::dfsan_aligned_alloc(alignment, size); +} + +INTERCEPTOR(void *, calloc, SIZE_T nmemb, SIZE_T size) { + if (DlsymAlloc::Use()) + return DlsymAlloc::Callocate(nmemb, size); + return __dfsan::dfsan_calloc(nmemb, size); +} + +INTERCEPTOR(void *, realloc, void *ptr, SIZE_T size) { + if (DlsymAlloc::Use() || DlsymAlloc::PointerIsMine(ptr)) + return DlsymAlloc::Realloc(ptr, size); + return __dfsan::dfsan_realloc(ptr, size); +} + +INTERCEPTOR(void *, malloc, SIZE_T size) { + if (DlsymAlloc::Use()) + return DlsymAlloc::Allocate(size); + return __dfsan::dfsan_malloc(size); +} + +INTERCEPTOR(void, free, void *ptr) { + if (!ptr) + return; + if (DlsymAlloc::PointerIsMine(ptr)) + return DlsymAlloc::Free(ptr); + return __dfsan::dfsan_deallocate(ptr); +} + +INTERCEPTOR(void, cfree, void *ptr) { + if (!ptr) + return; + if (DlsymAlloc::PointerIsMine(ptr)) + return DlsymAlloc::Free(ptr); + return __dfsan::dfsan_deallocate(ptr); +} + +INTERCEPTOR(int, posix_memalign, void **memptr, SIZE_T alignment, SIZE_T size) { + CHECK_NE(memptr, 0); + int res = __dfsan::dfsan_posix_memalign(memptr, alignment, size); + if (!res) + dfsan_set_label(0, memptr, sizeof(*memptr)); + return res; +} + +INTERCEPTOR(void *, memalign, SIZE_T alignment, SIZE_T size) { + return __dfsan::dfsan_memalign(alignment, size); +} + +INTERCEPTOR(void *, valloc, SIZE_T size) { return __dfsan::dfsan_valloc(size); } + +INTERCEPTOR(void *, pvalloc, SIZE_T size) { + return __dfsan::dfsan_pvalloc(size); +} + +INTERCEPTOR(void, mallinfo, __sanitizer_struct_mallinfo *sret) { + internal_memset(sret, 0, sizeof(*sret)); + dfsan_set_label(0, sret, sizeof(*sret)); +} + +INTERCEPTOR(int, mallopt, int cmd, int value) { return 0; } + +INTERCEPTOR(void, malloc_stats, void) { + // FIXME: implement, but don't call REAL(malloc_stats)! +} + +INTERCEPTOR(uptr, malloc_usable_size, void *ptr) { + return __sanitizer_get_allocated_size(ptr); +} + +#define ENSURE_DFSAN_INITED() \ + do { \ + CHECK(!__dfsan::dfsan_init_is_running); \ + if (!__dfsan::dfsan_inited) { \ + __dfsan::dfsan_init(); \ + } \ + } while (0) + +#define COMMON_INTERCEPTOR_ENTER(func, ...) \ + if (__dfsan::dfsan_init_is_running) \ + return REAL(func)(__VA_ARGS__); \ + ENSURE_DFSAN_INITED(); \ + dfsan_set_label(0, __errno_location(), sizeof(int)); + +INTERCEPTOR(void *, mmap, void *addr, SIZE_T length, int prot, int flags, + int fd, OFF_T offset) { + if (common_flags()->detect_write_exec) + ReportMmapWriteExec(prot, flags); + if (!__dfsan::dfsan_inited) + return (void *)internal_mmap(addr, length, prot, flags, fd, offset); + COMMON_INTERCEPTOR_ENTER(mmap, addr, length, prot, flags, fd, offset); + void *res = REAL(mmap)(addr, length, prot, flags, fd, offset); + if (res != (void *)-1) { + dfsan_set_label(0, res, RoundUpTo(length, GetPageSizeCached())); + } + return res; +} + +INTERCEPTOR(void *, mmap64, void *addr, SIZE_T length, int prot, int flags, + int fd, OFF64_T offset) { + if (common_flags()->detect_write_exec) + ReportMmapWriteExec(prot, flags); + if (!__dfsan::dfsan_inited) + return (void *)internal_mmap(addr, length, prot, flags, fd, offset); + COMMON_INTERCEPTOR_ENTER(mmap64, addr, length, prot, flags, fd, offset); + void *res = REAL(mmap64)(addr, length, prot, flags, fd, offset); + if (res != (void *)-1) { + dfsan_set_label(0, res, RoundUpTo(length, GetPageSizeCached())); + } + return res; +} + +INTERCEPTOR(int, munmap, void *addr, SIZE_T length) { + if (!__dfsan::dfsan_inited) + return internal_munmap(addr, length); + COMMON_INTERCEPTOR_ENTER(munmap, addr, length); + int res = REAL(munmap)(addr, length); + if (res != -1) + dfsan_set_label(0, addr, RoundUpTo(length, GetPageSizeCached())); + return res; +} + +#define COMMON_INTERCEPTOR_GET_TLS_RANGE(begin, end) \ + if (__dfsan::DFsanThread *t = __dfsan::GetCurrentThread()) { \ + *begin = t->tls_begin(); \ + *end = t->tls_end(); \ + } else { \ + *begin = *end = 0; \ + } +#define COMMON_INTERCEPTOR_INITIALIZE_RANGE(ptr, size) \ + dfsan_set_label(0, ptr, size) + +INTERCEPTOR(void *, __tls_get_addr, void *arg) { + COMMON_INTERCEPTOR_ENTER(__tls_get_addr, arg); + void *res = REAL(__tls_get_addr)(arg); + uptr tls_begin, tls_end; + COMMON_INTERCEPTOR_GET_TLS_RANGE(&tls_begin, &tls_end); + DTLS::DTV *dtv = DTLS_on_tls_get_addr(arg, res, tls_begin, tls_end); + if (dtv) { + // New DTLS block has been allocated. + COMMON_INTERCEPTOR_INITIALIZE_RANGE((void *)dtv->beg, dtv->size); + } + return res; +} + +namespace __dfsan { +void initialize_interceptors() { + CHECK(!interceptors_initialized); + + INTERCEPT_FUNCTION(aligned_alloc); + INTERCEPT_FUNCTION(calloc); + INTERCEPT_FUNCTION(cfree); + INTERCEPT_FUNCTION(free); + INTERCEPT_FUNCTION(mallinfo); + INTERCEPT_FUNCTION(malloc); + INTERCEPT_FUNCTION(malloc_stats); + INTERCEPT_FUNCTION(malloc_usable_size); + INTERCEPT_FUNCTION(mallopt); + INTERCEPT_FUNCTION(memalign); + INTERCEPT_FUNCTION(mmap); + INTERCEPT_FUNCTION(mmap64); + INTERCEPT_FUNCTION(munmap); + INTERCEPT_FUNCTION(posix_memalign); + INTERCEPT_FUNCTION(pvalloc); + INTERCEPT_FUNCTION(realloc); + INTERCEPT_FUNCTION(reallocarray); + INTERCEPT_FUNCTION(valloc); + INTERCEPT_FUNCTION(__tls_get_addr); + INTERCEPT_FUNCTION(__libc_memalign); + + interceptors_initialized = true; +} +} // namespace __dfsan diff --git a/contrib/libs/clang14-rt/lib/dfsan/dfsan_new_delete.cpp b/contrib/libs/clang14-rt/lib/dfsan/dfsan_new_delete.cpp new file mode 100644 index 0000000000..7ac906e810 --- /dev/null +++ b/contrib/libs/clang14-rt/lib/dfsan/dfsan_new_delete.cpp @@ -0,0 +1,124 @@ +//===-- dfsan_new_delete.cpp ----------------------------------------------===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file is a part of DataflowSanitizer. +// +// Interceptors for operators new and delete. +//===----------------------------------------------------------------------===// + +#include <stddef.h> + +#include "dfsan.h" +#include "interception/interception.h" +#include "sanitizer_common/sanitizer_allocator.h" +#include "sanitizer_common/sanitizer_allocator_report.h" + +using namespace __dfsan; + +// Fake std::nothrow_t and std::align_val_t to avoid including <new>. +namespace std { +struct nothrow_t {}; +enum class align_val_t : size_t {}; +} // namespace std + +// TODO(alekseys): throw std::bad_alloc instead of dying on OOM. +#define OPERATOR_NEW_BODY(nothrow) \ + void *res = dfsan_malloc(size); \ + if (!nothrow && UNLIKELY(!res)) { \ + BufferedStackTrace stack; \ + ReportOutOfMemory(size, &stack); \ + } \ + return res +#define OPERATOR_NEW_BODY_ALIGN(nothrow) \ + void *res = dfsan_memalign((uptr)align, size); \ + if (!nothrow && UNLIKELY(!res)) { \ + BufferedStackTrace stack; \ + ReportOutOfMemory(size, &stack); \ + } \ + return res; + +INTERCEPTOR_ATTRIBUTE +void *operator new(size_t size) { OPERATOR_NEW_BODY(false /*nothrow*/); } +INTERCEPTOR_ATTRIBUTE +void *operator new[](size_t size) { OPERATOR_NEW_BODY(false /*nothrow*/); } +INTERCEPTOR_ATTRIBUTE +void *operator new(size_t size, std::nothrow_t const &) { + OPERATOR_NEW_BODY(true /*nothrow*/); +} +INTERCEPTOR_ATTRIBUTE +void *operator new[](size_t size, std::nothrow_t const &) { + OPERATOR_NEW_BODY(true /*nothrow*/); +} +INTERCEPTOR_ATTRIBUTE +void *operator new(size_t size, std::align_val_t align) { + OPERATOR_NEW_BODY_ALIGN(false /*nothrow*/); +} +INTERCEPTOR_ATTRIBUTE +void *operator new[](size_t size, std::align_val_t align) { + OPERATOR_NEW_BODY_ALIGN(false /*nothrow*/); +} +INTERCEPTOR_ATTRIBUTE +void *operator new(size_t size, std::align_val_t align, + std::nothrow_t const &) { + OPERATOR_NEW_BODY_ALIGN(true /*nothrow*/); +} +INTERCEPTOR_ATTRIBUTE +void *operator new[](size_t size, std::align_val_t align, + std::nothrow_t const &) { + OPERATOR_NEW_BODY_ALIGN(true /*nothrow*/); +} + +#define OPERATOR_DELETE_BODY \ + if (ptr) \ + dfsan_deallocate(ptr) + +INTERCEPTOR_ATTRIBUTE +void operator delete(void *ptr)NOEXCEPT { OPERATOR_DELETE_BODY; } +INTERCEPTOR_ATTRIBUTE +void operator delete[](void *ptr) NOEXCEPT { OPERATOR_DELETE_BODY; } +INTERCEPTOR_ATTRIBUTE +void operator delete(void *ptr, std::nothrow_t const &) { + OPERATOR_DELETE_BODY; +} +INTERCEPTOR_ATTRIBUTE +void operator delete[](void *ptr, std::nothrow_t const &) { + OPERATOR_DELETE_BODY; +} +INTERCEPTOR_ATTRIBUTE +void operator delete(void *ptr, size_t size)NOEXCEPT { OPERATOR_DELETE_BODY; } +INTERCEPTOR_ATTRIBUTE +void operator delete[](void *ptr, size_t size) NOEXCEPT { + OPERATOR_DELETE_BODY; +} +INTERCEPTOR_ATTRIBUTE +void operator delete(void *ptr, std::align_val_t align)NOEXCEPT { + OPERATOR_DELETE_BODY; +} +INTERCEPTOR_ATTRIBUTE +void operator delete[](void *ptr, std::align_val_t align) NOEXCEPT { + OPERATOR_DELETE_BODY; +} +INTERCEPTOR_ATTRIBUTE +void operator delete(void *ptr, std::align_val_t align, + std::nothrow_t const &) { + OPERATOR_DELETE_BODY; +} +INTERCEPTOR_ATTRIBUTE +void operator delete[](void *ptr, std::align_val_t align, + std::nothrow_t const &) { + OPERATOR_DELETE_BODY; +} +INTERCEPTOR_ATTRIBUTE +void operator delete(void *ptr, size_t size, std::align_val_t align)NOEXCEPT { + OPERATOR_DELETE_BODY; +} +INTERCEPTOR_ATTRIBUTE +void operator delete[](void *ptr, size_t size, + std::align_val_t align) NOEXCEPT { + OPERATOR_DELETE_BODY; +} diff --git a/contrib/libs/clang14-rt/lib/dfsan/dfsan_origin.h b/contrib/libs/clang14-rt/lib/dfsan/dfsan_origin.h new file mode 100644 index 0000000000..89fd7f9959 --- /dev/null +++ b/contrib/libs/clang14-rt/lib/dfsan/dfsan_origin.h @@ -0,0 +1,127 @@ +//===-- dfsan_origin.h ----------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file is a part of DataFlowSanitizer. +// +// Origin id utils. +//===----------------------------------------------------------------------===// + +#ifndef DFSAN_ORIGIN_H +#define DFSAN_ORIGIN_H + +#include "dfsan_chained_origin_depot.h" +#include "dfsan_flags.h" +#include "sanitizer_common/sanitizer_stackdepot.h" + +namespace __dfsan { + +// Origin handling. +// +// Origin is a 32-bit identifier that is attached to any taint value in the +// program and describes how this memory came to be tainted. +// +// Chained origin id is like: +// zzzz xxxx xxxx xxxx +// +// Chained origin id describes an event of storing a taint value to +// memory. The xxx part is a value of ChainedOriginDepot, which is a mapping of +// (stack_id, prev_id) -> id, where +// * stack_id describes the event. +// StackDepot keeps a mapping between those and corresponding stack traces. +// * prev_id is another origin id that describes the earlier part of the +// taint value history. 0 prev_id indicates the start of a chain. +// Following a chain of prev_id provides the full recorded history of a taint +// value. +// +// This, effectively, defines a forest where nodes are points in value history +// marked with origin ids, and edges are events that are marked with stack_id. +// +// The "zzzz" bits of chained origin id are used to store the length of the +// origin chain. + +class Origin { + public: + static bool isValidId(u32 id) { return id != 0; } + + u32 raw_id() const { return raw_id_; } + + bool isChainedOrigin() const { return Origin::isValidId(raw_id_); } + + u32 getChainedId() const { + CHECK(Origin::isValidId(raw_id_)); + return raw_id_ & kChainedIdMask; + } + + // Returns the next origin in the chain and the current stack trace. + // + // It scans a partition of StackDepot linearly, and is used only by origin + // tracking report. + Origin getNextChainedOrigin(StackTrace *stack) const { + CHECK(Origin::isValidId(raw_id_)); + u32 prev_id; + u32 stack_id = GetChainedOriginDepot()->Get(getChainedId(), &prev_id); + if (stack) + *stack = StackDepotGet(stack_id); + return Origin(prev_id); + } + + static Origin CreateChainedOrigin(Origin prev, StackTrace *stack) { + int depth = prev.isChainedOrigin() ? prev.depth() : -1; + // depth is the length of the chain minus 1. + // origin_history_size of 0 means unlimited depth. + if (flags().origin_history_size > 0) { + ++depth; + if (depth >= flags().origin_history_size || depth > kMaxDepth) + return prev; + } + + StackDepotHandle h = StackDepotPut_WithHandle(*stack); + if (!h.valid()) + return prev; + + if (flags().origin_history_per_stack_limit > 0) { + int use_count = h.use_count(); + if (use_count > flags().origin_history_per_stack_limit) + return prev; + } + + u32 chained_id; + bool inserted = + GetChainedOriginDepot()->Put(h.id(), prev.raw_id(), &chained_id); + CHECK((chained_id & kChainedIdMask) == chained_id); + + if (inserted && flags().origin_history_per_stack_limit > 0) + h.inc_use_count_unsafe(); + + return Origin((depth << kDepthShift) | chained_id); + } + + static Origin FromRawId(u32 id) { return Origin(id); } + + private: + static const int kDepthBits = 4; + static const int kDepthShift = 32 - kDepthBits; + + static const u32 kChainedIdMask = ((u32)-1) >> kDepthBits; + + u32 raw_id_; + + explicit Origin(u32 raw_id) : raw_id_(raw_id) {} + + int depth() const { + CHECK(isChainedOrigin()); + return (raw_id_ >> kDepthShift) & ((1 << kDepthBits) - 1); + } + + public: + static const int kMaxDepth = (1 << kDepthBits) - 1; +}; + +} // namespace __dfsan + +#endif // DFSAN_ORIGIN_H diff --git a/contrib/libs/clang14-rt/lib/dfsan/dfsan_platform.h b/contrib/libs/clang14-rt/lib/dfsan/dfsan_platform.h new file mode 100644 index 0000000000..9b4333ee99 --- /dev/null +++ b/contrib/libs/clang14-rt/lib/dfsan/dfsan_platform.h @@ -0,0 +1,88 @@ +//===-- dfsan_platform.h ----------------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file is a part of DataFlowSanitizer. +// +// Platform specific information for DFSan. +//===----------------------------------------------------------------------===// + +#ifndef DFSAN_PLATFORM_H +#define DFSAN_PLATFORM_H + +#include "sanitizer_common/sanitizer_common.h" +#include "sanitizer_common/sanitizer_platform.h" + +namespace __dfsan { + +using __sanitizer::uptr; + +// TODO: The memory mapping code to setup a 1:1 shadow is based on msan. +// Consider refactoring these into a shared implementation. + +struct MappingDesc { + uptr start; + uptr end; + enum Type { INVALID, APP, SHADOW, ORIGIN } type; + const char *name; +}; + +#if SANITIZER_LINUX && SANITIZER_WORDSIZE == 64 + +// All of the following configurations are supported. +// ASLR disabled: main executable and DSOs at 0x555550000000 +// PIE and ASLR: main executable and DSOs at 0x7f0000000000 +// non-PIE: main executable below 0x100000000, DSOs at 0x7f0000000000 +// Heap at 0x700000000000. +const MappingDesc kMemoryLayout[] = { + {0x000000000000ULL, 0x010000000000ULL, MappingDesc::APP, "app-1"}, + {0x010000000000ULL, 0x100000000000ULL, MappingDesc::SHADOW, "shadow-2"}, + {0x100000000000ULL, 0x110000000000ULL, MappingDesc::INVALID, "invalid"}, + {0x110000000000ULL, 0x200000000000ULL, MappingDesc::ORIGIN, "origin-2"}, + {0x200000000000ULL, 0x300000000000ULL, MappingDesc::SHADOW, "shadow-3"}, + {0x300000000000ULL, 0x400000000000ULL, MappingDesc::ORIGIN, "origin-3"}, + {0x400000000000ULL, 0x500000000000ULL, MappingDesc::INVALID, "invalid"}, + {0x500000000000ULL, 0x510000000000ULL, MappingDesc::SHADOW, "shadow-1"}, + {0x510000000000ULL, 0x600000000000ULL, MappingDesc::APP, "app-2"}, + {0x600000000000ULL, 0x610000000000ULL, MappingDesc::ORIGIN, "origin-1"}, + {0x610000000000ULL, 0x700000000000ULL, MappingDesc::INVALID, "invalid"}, + {0x700000000000ULL, 0x800000000000ULL, MappingDesc::APP, "app-3"}}; +# define MEM_TO_SHADOW(mem) (((uptr)(mem)) ^ 0x500000000000ULL) +# define SHADOW_TO_ORIGIN(mem) (((uptr)(mem)) + 0x100000000000ULL) + +#else +# error "Unsupported platform" +#endif + +const uptr kMemoryLayoutSize = sizeof(kMemoryLayout) / sizeof(kMemoryLayout[0]); + +#define MEM_TO_ORIGIN(mem) (SHADOW_TO_ORIGIN(MEM_TO_SHADOW((mem)))) + +#ifndef __clang__ +__attribute__((optimize("unroll-loops"))) +#endif +inline bool +addr_is_type(uptr addr, MappingDesc::Type mapping_type) { +// It is critical for performance that this loop is unrolled (because then it is +// simplified into just a few constant comparisons). +#ifdef __clang__ +# pragma unroll +#endif + for (unsigned i = 0; i < kMemoryLayoutSize; ++i) + if (kMemoryLayout[i].type == mapping_type && + addr >= kMemoryLayout[i].start && addr < kMemoryLayout[i].end) + return true; + return false; +} + +#define MEM_IS_APP(mem) addr_is_type((uptr)(mem), MappingDesc::APP) +#define MEM_IS_SHADOW(mem) addr_is_type((uptr)(mem), MappingDesc::SHADOW) +#define MEM_IS_ORIGIN(mem) addr_is_type((uptr)(mem), MappingDesc::ORIGIN) + +} // namespace __dfsan + +#endif diff --git a/contrib/libs/clang14-rt/lib/dfsan/dfsan_thread.cpp b/contrib/libs/clang14-rt/lib/dfsan/dfsan_thread.cpp new file mode 100644 index 0000000000..df7e4d9b74 --- /dev/null +++ b/contrib/libs/clang14-rt/lib/dfsan/dfsan_thread.cpp @@ -0,0 +1,144 @@ +#include "dfsan_thread.h" + +#include <pthread.h> + +#include "dfsan.h" +#include "sanitizer_common/sanitizer_tls_get_addr.h" + +namespace __dfsan { + +DFsanThread *DFsanThread::Create(void *start_routine_trampoline, + thread_callback_t start_routine, void *arg, + bool track_origins) { + uptr PageSize = GetPageSizeCached(); + uptr size = RoundUpTo(sizeof(DFsanThread), PageSize); + DFsanThread *thread = (DFsanThread *)MmapOrDie(size, __func__); + thread->start_routine_trampoline_ = start_routine_trampoline; + thread->start_routine_ = start_routine; + thread->arg_ = arg; + thread->track_origins_ = track_origins; + thread->destructor_iterations_ = GetPthreadDestructorIterations(); + + return thread; +} + +void DFsanThread::SetThreadStackAndTls() { + uptr tls_size = 0; + uptr stack_size = 0; + GetThreadStackAndTls(IsMainThread(), &stack_.bottom, &stack_size, &tls_begin_, + &tls_size); + stack_.top = stack_.bottom + stack_size; + tls_end_ = tls_begin_ + tls_size; + + int local; + CHECK(AddrIsInStack((uptr)&local)); +} + +void DFsanThread::ClearShadowForThreadStackAndTLS() { + dfsan_set_label(0, (void *)stack_.bottom, stack_.top - stack_.bottom); + if (tls_begin_ != tls_end_) + dfsan_set_label(0, (void *)tls_begin_, tls_end_ - tls_begin_); + DTLS *dtls = DTLS_Get(); + CHECK_NE(dtls, 0); + ForEachDVT(dtls, [](const DTLS::DTV &dtv, int id) { + dfsan_set_label(0, (void *)(dtv.beg), dtv.size); + }); +} + +void DFsanThread::Init() { + SetThreadStackAndTls(); + ClearShadowForThreadStackAndTLS(); +} + +void DFsanThread::TSDDtor(void *tsd) { + DFsanThread *t = (DFsanThread *)tsd; + t->Destroy(); +} + +void DFsanThread::Destroy() { + malloc_storage().CommitBack(); + // We also clear the shadow on thread destruction because + // some code may still be executing in later TSD destructors + // and we don't want it to have any poisoned stack. + ClearShadowForThreadStackAndTLS(); + uptr size = RoundUpTo(sizeof(DFsanThread), GetPageSizeCached()); + UnmapOrDie(this, size); + DTLS_Destroy(); +} + +thread_return_t DFsanThread::ThreadStart() { + if (!start_routine_) { + // start_routine_ == 0 if we're on the main thread or on one of the + // OS X libdispatch worker threads. But nobody is supposed to call + // ThreadStart() for the worker threads. + return 0; + } + + CHECK(start_routine_trampoline_); + + typedef void *(*thread_callback_trampoline_t)(void *, void *, dfsan_label, + dfsan_label *); + typedef void *(*thread_callback_origin_trampoline_t)( + void *, void *, dfsan_label, dfsan_label *, dfsan_origin, dfsan_origin *); + + dfsan_label ret_label; + if (!track_origins_) + return ((thread_callback_trampoline_t) + start_routine_trampoline_)((void *)start_routine_, arg_, 0, + &ret_label); + + dfsan_origin ret_origin; + return ((thread_callback_origin_trampoline_t) + start_routine_trampoline_)((void *)start_routine_, arg_, 0, + &ret_label, 0, &ret_origin); +} + +DFsanThread::StackBounds DFsanThread::GetStackBounds() const { + return {stack_.bottom, stack_.top}; +} + +uptr DFsanThread::stack_top() { return GetStackBounds().top; } + +uptr DFsanThread::stack_bottom() { return GetStackBounds().bottom; } + +bool DFsanThread::AddrIsInStack(uptr addr) { + const auto bounds = GetStackBounds(); + return addr >= bounds.bottom && addr < bounds.top; +} + +static pthread_key_t tsd_key; +static bool tsd_key_inited = false; + +void DFsanTSDInit(void (*destructor)(void *tsd)) { + CHECK(!tsd_key_inited); + tsd_key_inited = true; + CHECK_EQ(0, pthread_key_create(&tsd_key, destructor)); +} + +static THREADLOCAL DFsanThread *dfsan_current_thread; + +DFsanThread *GetCurrentThread() { return dfsan_current_thread; } + +void SetCurrentThread(DFsanThread *t) { + // Make sure we do not reset the current DFsanThread. + CHECK_EQ(0, dfsan_current_thread); + dfsan_current_thread = t; + // Make sure that DFsanTSDDtor gets called at the end. + CHECK(tsd_key_inited); + pthread_setspecific(tsd_key, t); +} + +void DFsanTSDDtor(void *tsd) { + DFsanThread *t = (DFsanThread *)tsd; + if (t->destructor_iterations_ > 1) { + t->destructor_iterations_--; + CHECK_EQ(0, pthread_setspecific(tsd_key, tsd)); + return; + } + dfsan_current_thread = nullptr; + // Make sure that signal handler can not see a stale current thread pointer. + atomic_signal_fence(memory_order_seq_cst); + DFsanThread::TSDDtor(tsd); +} + +} // namespace __dfsan diff --git a/contrib/libs/clang14-rt/lib/dfsan/dfsan_thread.h b/contrib/libs/clang14-rt/lib/dfsan/dfsan_thread.h new file mode 100644 index 0000000000..1c33a18549 --- /dev/null +++ b/contrib/libs/clang14-rt/lib/dfsan/dfsan_thread.h @@ -0,0 +1,84 @@ +//===-- dfsan_thread.h ------------------------------------------*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception +// +//===----------------------------------------------------------------------===// +// +// This file is a part of DataFlowSanitizer. +// +//===----------------------------------------------------------------------===// + +#ifndef DFSAN_THREAD_H +#define DFSAN_THREAD_H + +#include "dfsan_allocator.h" +#include "sanitizer_common/sanitizer_common.h" +#include "sanitizer_common/sanitizer_posix.h" + +namespace __dfsan { + +class DFsanThread { + public: + // NOTE: There is no DFsanThread constructor. It is allocated + // via mmap() and *must* be valid in zero-initialized state. + + static DFsanThread *Create(void *start_routine_trampoline, + thread_callback_t start_routine, void *arg, + bool track_origins = false); + static void TSDDtor(void *tsd); + void Destroy(); + + void Init(); // Should be called from the thread itself. + thread_return_t ThreadStart(); + + uptr stack_top(); + uptr stack_bottom(); + uptr tls_begin() { return tls_begin_; } + uptr tls_end() { return tls_end_; } + bool IsMainThread() { return start_routine_ == nullptr; } + + bool InSignalHandler() { return in_signal_handler_; } + void EnterSignalHandler() { in_signal_handler_++; } + void LeaveSignalHandler() { in_signal_handler_--; } + + DFsanThreadLocalMallocStorage &malloc_storage() { return malloc_storage_; } + + int destructor_iterations_; + __sanitizer_sigset_t starting_sigset_; + + private: + void SetThreadStackAndTls(); + void ClearShadowForThreadStackAndTLS(); + struct StackBounds { + uptr bottom; + uptr top; + }; + StackBounds GetStackBounds() const; + + bool AddrIsInStack(uptr addr); + + void *start_routine_trampoline_; + thread_callback_t start_routine_; + void *arg_; + bool track_origins_; + + StackBounds stack_; + + uptr tls_begin_; + uptr tls_end_; + + unsigned in_signal_handler_; + + DFsanThreadLocalMallocStorage malloc_storage_; +}; + +DFsanThread *GetCurrentThread(); +void SetCurrentThread(DFsanThread *t); +void DFsanTSDInit(void (*destructor)(void *tsd)); +void DFsanTSDDtor(void *tsd); + +} // namespace __dfsan + +#endif // DFSAN_THREAD_H diff --git a/contrib/libs/clang14-rt/lib/dfsan/ya.make b/contrib/libs/clang14-rt/lib/dfsan/ya.make new file mode 100644 index 0000000000..33ed989b25 --- /dev/null +++ b/contrib/libs/clang14-rt/lib/dfsan/ya.make @@ -0,0 +1,123 @@ +# Generated by devtools/yamaker. + +INCLUDE(${ARCADIA_ROOT}/build/platform/clang/arch.cmake) + +LIBRARY(clang_rt.dfsan${CLANG_RT_SUFFIX}) + +LICENSE( + Apache-2.0 AND + Apache-2.0 WITH LLVM-exception AND + MIT AND + NCSA +) + +LICENSE_TEXTS(.yandex_meta/licenses.list.txt) + +OWNER(g:cpp-contrib) + +ADDINCL( + contrib/libs/clang14-rt/lib +) + +NO_COMPILER_WARNINGS() + +NO_UTIL() + +NO_SANITIZE() + +CFLAGS( + -DHAVE_RPC_XDR_H=0 + -fPIE + -fcommon + -ffreestanding + -fno-builtin + -fno-exceptions + -fno-lto + -fno-rtti + -fno-stack-protector + -fomit-frame-pointer + -funwind-tables + -fvisibility=hidden +) + +SRCDIR(contrib/libs/clang14-rt/lib) + +SRCS( + dfsan/dfsan.cpp + dfsan/dfsan_allocator.cpp + dfsan/dfsan_chained_origin_depot.cpp + dfsan/dfsan_custom.cpp + dfsan/dfsan_interceptors.cpp + dfsan/dfsan_new_delete.cpp + dfsan/dfsan_thread.cpp + interception/interception_linux.cpp + interception/interception_mac.cpp + interception/interception_type_test.cpp + interception/interception_win.cpp + sanitizer_common/sanitizer_allocator.cpp + sanitizer_common/sanitizer_allocator_checks.cpp + sanitizer_common/sanitizer_allocator_report.cpp + sanitizer_common/sanitizer_chained_origin_depot.cpp + sanitizer_common/sanitizer_common.cpp + sanitizer_common/sanitizer_common_libcdep.cpp + sanitizer_common/sanitizer_deadlock_detector1.cpp + sanitizer_common/sanitizer_deadlock_detector2.cpp + sanitizer_common/sanitizer_errno.cpp + sanitizer_common/sanitizer_file.cpp + sanitizer_common/sanitizer_flag_parser.cpp + sanitizer_common/sanitizer_flags.cpp + sanitizer_common/sanitizer_fuchsia.cpp + sanitizer_common/sanitizer_libc.cpp + sanitizer_common/sanitizer_libignore.cpp + sanitizer_common/sanitizer_linux.cpp + sanitizer_common/sanitizer_linux_libcdep.cpp + sanitizer_common/sanitizer_linux_s390.cpp + sanitizer_common/sanitizer_mac.cpp + sanitizer_common/sanitizer_mac_libcdep.cpp + sanitizer_common/sanitizer_mutex.cpp + sanitizer_common/sanitizer_netbsd.cpp + sanitizer_common/sanitizer_platform_limits_freebsd.cpp + sanitizer_common/sanitizer_platform_limits_linux.cpp + sanitizer_common/sanitizer_platform_limits_netbsd.cpp + sanitizer_common/sanitizer_platform_limits_posix.cpp + sanitizer_common/sanitizer_platform_limits_solaris.cpp + sanitizer_common/sanitizer_posix.cpp + sanitizer_common/sanitizer_posix_libcdep.cpp + sanitizer_common/sanitizer_printf.cpp + sanitizer_common/sanitizer_procmaps_bsd.cpp + sanitizer_common/sanitizer_procmaps_common.cpp + sanitizer_common/sanitizer_procmaps_fuchsia.cpp + sanitizer_common/sanitizer_procmaps_linux.cpp + sanitizer_common/sanitizer_procmaps_mac.cpp + sanitizer_common/sanitizer_procmaps_solaris.cpp + sanitizer_common/sanitizer_solaris.cpp + sanitizer_common/sanitizer_stack_store.cpp + sanitizer_common/sanitizer_stackdepot.cpp + sanitizer_common/sanitizer_stacktrace.cpp + sanitizer_common/sanitizer_stacktrace_libcdep.cpp + sanitizer_common/sanitizer_stacktrace_printer.cpp + sanitizer_common/sanitizer_stacktrace_sparc.cpp + sanitizer_common/sanitizer_stoptheworld_fuchsia.cpp + sanitizer_common/sanitizer_stoptheworld_linux_libcdep.cpp + sanitizer_common/sanitizer_stoptheworld_mac.cpp + sanitizer_common/sanitizer_stoptheworld_netbsd_libcdep.cpp + sanitizer_common/sanitizer_stoptheworld_win.cpp + sanitizer_common/sanitizer_suppressions.cpp + sanitizer_common/sanitizer_symbolizer.cpp + sanitizer_common/sanitizer_symbolizer_libbacktrace.cpp + sanitizer_common/sanitizer_symbolizer_libcdep.cpp + sanitizer_common/sanitizer_symbolizer_mac.cpp + sanitizer_common/sanitizer_symbolizer_markup.cpp + sanitizer_common/sanitizer_symbolizer_posix_libcdep.cpp + sanitizer_common/sanitizer_symbolizer_report.cpp + sanitizer_common/sanitizer_symbolizer_win.cpp + sanitizer_common/sanitizer_termination.cpp + sanitizer_common/sanitizer_thread_registry.cpp + sanitizer_common/sanitizer_tls_get_addr.cpp + sanitizer_common/sanitizer_type_traits.cpp + sanitizer_common/sanitizer_unwind_linux_libcdep.cpp + sanitizer_common/sanitizer_unwind_win.cpp + sanitizer_common/sanitizer_win.cpp +) + +END() |