Skip to content

feat: unstable SHA256 support#1206

Open
weihanglo wants to merge 8 commits into
rust-lang:mainfrom
weihanglo:sha256-git2
Open

feat: unstable SHA256 support#1206
weihanglo wants to merge 8 commits into
rust-lang:mainfrom
weihanglo:sha256-git2

Conversation

@weihanglo
Copy link
Copy Markdown
Member

@weihanglo weihanglo commented Jan 12, 2026

This adds an unstable-sha256 Cargo feature, as a follow-up of #1201.
Also adds smoke tests for affected operations/types.

Part of #1090

Design

Introduce ObjectFormat accessors and *_ext API variants unconditionally,
The actual SHA256 wiring are gated behind the unstable-sha256 Cargo feature
in order to keep public API additive.

The *_ext suffix follows libgit2's convention for extended function variants.
At next major version, the original methods may absorb the format parameter.
The actual change will leave for future.

Insta-stable

  • NEW ObjectFormat enum (#[non_exhaustive] when unstable-sha256 is not enabled)
  • NEW Oid::ZERO_SHA1 const
  • NEW Oid::object_format() to access underlying OID type
  • NEW Oid::raw_bytes() to return the full underlying byte buffer
  • NEW Repository::object_format() to get hash algo on a repo
  • NEW Remote::object_format() to get hash algo on a remote
  • NEW RepositoryInitOptions::object_format() setter
  • NEW Oid::from_str_ext / Oid::hash_object_ext / Oid::hash_file_ext
  • NEW Diff::from_buffer_ext
  • NEW Index::new_ext / Index::open_ext
  • NEW Indexer::new_ext
  • NEW Odb::new_ext
  • DEPRECATED Oid::zero() in favor of Oid::ZERO_SHA1

Behind unstable-sha256

  • NEW ObjectFormat::Sha256 enum variant
  • NEW Oid::ZERO_SHA256 const
  • CHANGED All _ext methods dispatch to SHA256-aware FFI paths
  • CHANGED Oid::as_bytes returns logical bytes (SHA1 -> 20; SHA256 -> 32)
  • CHANGED Oid::from_bytes accepts 20 or 32 bytes

Comment thread src/oid.rs Outdated
@lorenzleutgeb
Copy link
Copy Markdown

lorenzleutgeb commented Apr 25, 2026

I believe to have found something that you missed in this PR. Note that git2::util::zeroed_raw_oid does not take any arguments:

git2-rs/src/util.rs

Lines 271 to 275 in e6919b7

/// Creates a zeroed git_oid structure.
#[inline]
pub(crate) fn zeroed_raw_oid() -> raw::git_oid {
unsafe { std::mem::zeroed() }
}

I believe that this means that the kind: c_uchar (where c_uchar is defined as u8) member of the returned libgit2_sys::git_oid will have the value 0u8.

The function git2::util::zeroed_raw_oid is called by git2::Oid::zero, which also takes no arguments. If the function git2::Oid::object_format is called on the result of git2::Oid::zero, the call

unsafe { Binding::from_raw(self.raw.kind as raw::git_oid_t) }

will hit the panic! at

_ => panic!("Unknown git oid type"),

This is because 0u8 as u32 (which we got from std::mem::zeroed) is neither GIT_OID_SHA1 = 1, nor GIT_OID_SHA256 = 2.

I discovered this experimenting in code that is downstream of git2, building against this PR. In one of my tests I did essentially git2::Oid::zero().object_format().

Then I started to think about how much unsafe really is necessary here. Maybe zeroed_raw_oid could look like

#[inline]
pub(crate) fn zeroed_raw_oid(format: ObjectFormat) -> raw::git_oid {
  raw::git_oid {
    kind: format.raw(),
    id: [0; raw::GIT_OID_MAX_SIZE],
  }
}

In fact one might even ask why not to have

impl Oid {
  pub const ZERO_SHA1: Self = Self {
    raw: raw::git_oid {
      kind: raw::GIT_OID_SHA1,
      id: [0; raw::GIT_OID_MAX_SIZE],
    }
  };
  
  pub const ZERO_SHA256: Self = Self {
    raw: raw::git_oid {
      kind: raw::GIT_OID_SHA256,
      id: [0; raw::GIT_OID_MAX_SIZE],
    }
  };
  
  // NOTE: Maybe call this `zero_with_object_format` to not break `zero`.
  pub fn zero(format: ObjectFormat) -> Self {
    // TODO: Match on `format` and return `Oid::ZERO_SHA1` or `Oid::ZERO_SHA256`.
  }
}

This way, users that know at compile time which kind of zero they want can just use the constant, while those that decide at run time would call zero. And there is no unsafe involved.

If you do not want git2::Oid::zero to take an argument for backwards compatibility, then you could have it return a SHA-1 object identifier always, and mark it deprecated in favor of these constants and a new zero_with_object_format (if I have not missed a reason why the constants cannot be used).

For oid.rs I also had the gut feeling that some unsafe could be removed, but I do not have any constructive comments here, sorry.

@lorenzleutgeb
Copy link
Copy Markdown

lorenzleutgeb commented Apr 25, 2026

Another comment regarding FromStr: I understand you opted to remove it to "avoid misuse", but I would argue that having FromStr actually is quite convenient. Instead of removing it, you could consider implementing FromStr in a way that it checks the length of the argument and then calls into the more specialized function that takes a string and an object format. The caller can then still inspect the object format of the result. You leave all options open for your users, and keep backwards compatibility.

Copy link
Copy Markdown
Contributor

@DanielEScherzer DanielEScherzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just some thoughts, not going to pretend that I understand this fully

View changes since this review

Comment thread src/index.rs Outdated
Comment thread src/odb.rs Outdated
Comment thread src/oid.rs Outdated
Comment thread src/oid.rs
#[cfg(not(feature = "unstable-sha256"))]
{
if bytes.len() != raw::GIT_OID_SHA1_SIZE {
return Err(Error::from_str("raw byte array must be 20 bytes"));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe a clearer error for trying to accidentally use 32 bytes when the feature isn't enabled?

Also, why have different ways of creating the SHA1 oid depending on if the SHA256 feature is available?

let oid_type = match bytes.len() {
    raw::GIT_OID_SHA1_SIZE => raw::GIT_OID_SHA1,
    #[cfg(feature = "unstable-sha256")]
    raw::GIT_OID_SHA256_SIZE => raw::GIT_OID_SHA256,
    #[cfg(not(feature = "unstable-sha256"))]
    raw::GIT_OID_SHA256_SIZE => {
        return Err(Error::from_str(
            "raw byte array of 32 bytes requires unstable-sha256 feature to be enabled",
        ))
    }
    _ => {
        return Err(Error::from_str(
            "raw byte array must be 20 bytes (SHA1) or 32 bytes (SHA256, if enabled)",
        ))
    }
};
unsafe {
    try_call!(raw::git_oid_from_raw(&mut raw, bytes.as_ptr(), oid_type));
}
Ok(Oid { raw })

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion, though I am not sure if here we should talk about unstable-sha256 feature, downstream consumer of git2-rs may just propagate error message to end-users, and that might be weird to see that.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the error message is propagated to end users, presumably the end users had some kind of say in the input value

Comment thread src/oid.rs Outdated
Comment thread src/remote.rs Outdated
}
#[cfg(not(feature = "unstable-sha256"))]
{
let _ = oid_type;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, even if the git2-rs bindings don't have SHA256 enabled, if the oid_type corresponds to SHA256 we should say that and probably return an error, rather than just claiming that the repository is formatted with SHA1?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, without unstable-sha256, the underlying libgit2-sys can't return a SHA256 oid or even open a SHA256 repo.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the underlying library was built with sha256 enabled, but the git2 crate wasn't?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK none of the mainstream packagers distribute libgit2.so with SHA256 support. The reason being it was broken before libgit2 1.9.3 (released a week ago). Also when sha256 is enabled, libgit2 built a shared libray with -experimental suffix. Unless packager intentionally patch that build script to remove the suffix, it is pretty distinguishable. However, it is unlikely someone does that, since the two are ABI incompatible and doing that breaks other programs dynamically linking to it.

I also raised this concern in rust-lang/cargo#16939 (comment) but from the other angle – Cargo may need to build from source always as nobody ships libgit2-experimental.so.

@rustbot

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-author Status: Waiting on PR author label May 5, 2026
@weihanglo
Copy link
Copy Markdown
Member Author

weihanglo commented May 7, 2026

Another comment regarding FromStr: I understand you opted to remove it to "avoid misuse", but I would argue that having FromStr actually is quite convenient. Instead of removing it, you could consider implementing FromStr in a way that it checks the length of the argument and then calls into the more specialized function that takes a string and an object format. The caller can then still inspect the object format of the result. You leave all options open for your users, and keep backwards compatibility.

This is still ambiguous for those not 41-64 char hex strings. I'd still consider deprecating it and eventually remove.
Alternatively, we just keep it there and have a compatibility note.

@weihanglo weihanglo force-pushed the sha256-git2 branch 2 times, most recently from 9b1f12d to 5778238 Compare May 8, 2026 00:03
Copy link
Copy Markdown
Member Author

@weihanglo weihanglo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New revision is up.

The major difference is that we stop follow libgit2@main but 1.9.3. The other is API design, which is stated in the PR description:

Introduce ObjectFormat accessors and *_ext API variants unconditionally,
The actual SHA256 wiring are gated behind the unstable-sha256 Cargo feature
in order to keep public API additive.

The *_ext suffix follows libgit2's convention for extended function variants.
At next major version, the original methods may absorb the format parameter.
The actual change will leave for future.

View changes since this review

@weihanglo weihanglo marked this pull request as ready for review May 8, 2026 00:27
@rustbot rustbot added the S-waiting-on-review Status: Waiting on review label May 8, 2026
@weihanglo
Copy link
Copy Markdown
Member Author

cc @ehuss

@weihanglo weihanglo removed the S-waiting-on-author Status: Waiting on PR author label May 8, 2026
@weihanglo weihanglo changed the title [WIP] feat: unstable SHA256 support feat: unstable SHA256 support May 8, 2026
Copy link
Copy Markdown

@lorenzleutgeb lorenzleutgeb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will double down on what I explained in #1206 (comment): If the feature unstable-sha256 is enabled, the implementation of the unsafe function crate::util::zeroed_raw_oid is incorrect. It allows to create OIDs in memory, which are invalid. Their kind of value zero does not correspond to a known git2::ObjectFormat and also neither to libgit2's GIT_OID_SHA1 nor GIT_OID_SHA256. That is because the function uses std::mem::zeroed but the object formats are non-zero bytes. This is a huge footgun. Please consider these options:

  1. Remove crate::util::zeroed_raw_oid.
  2. Remove the unsafe keyword from crate::util::zeroed_raw_oid and reimplement it without unsafe blocks. This will force you to add an argument format: git2::ObjectFormat, otherwise, if unstable-sha256 is enabled, you will not be able to construct a legal zero OID.
  3. Use #[cfg(not(feature = "unstable-sha256"))] crate::util::zeroed_raw_oid.

View changes since this review

Comment thread src/oid.rs
Sha1,
/// SHA256 object format (32-byte object IDs)
#[cfg(feature = "unstable-sha256")]
Sha256,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using an explicit discriminant that matches with libgit2. Like

Suggested change
Sha256,
Sha256 = raw::GIT_OID_SHA256,

or

Suggested change
Sha256,
Sha256 = 2,

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion. Some of the pre-existing translation enums in git2-rs follow the plain enum + Binding + explicit match pattern. libgit2_sys::git_oid_t is a numeric type alias, and ObjectFormat is the safe abstraction over it. Explicit discriminants would definitly simplify ObjectFormat::raw(), though from_raw still needs the match for safety against invalid C values. I'd keep ObjectFormat as-is for now.

Comment thread src/oid.rs Outdated

/// Object ID format (hash algorithm).
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
#[non_exhaustive]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have been more explicit when I suggested this in #1090 (comment), sorry.

Adding #[non_exhaustive] breaks downstream crates, since their matches will fail to compile. Thus I think this is a breaking change. This might be completely fine, as this crate is 0.x anyway, but I think it's worth noting. That being said...

Adding #[non_exhaustive] even when unstable-sha256 is enabled, might be "too much". This way you are preparing your users for whatever comes after SHA-256. Which might be your intention, and then that's fine. But you might as well consider that it might take another decade or two until that happens, and you could spare your dependents the _ => unimplemented!("...") arm. This would mean that adding the next object format is a breaking change. And that might be fine as well.

So one approach might be to have

Suggested change
#[non_exhaustive]
#[cfg_attr(not(feature = "unstable-sha256"), non_exhaustive)]

now, and whenever SHA-256 becomes stable, to remove non_exhaustive.

This way, all dependents not enabling unstable-sha256 will be forced to think about further object formats (which might make sense, since SHA-256 is already out in the wild), and everyone that opts in to unstable-sha256 can enjoy that this enum is exhaustive. The migration (as SHA-256 stabilizes) also works out nicely: The _ => unimplemented!("...") arm nicely takes the spot of the Sha256 variant, but since now the non_exhaustive attribute would be removed no third arm is required once the dependent decides to implement support for SHA-256.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then this is probably not worthy. We will anyway have another major version bump when stabilizing sha256 feature.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think about it more, yeah this sounds good.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed and squashed into 17d0f51. Thanks again for the forward-looking suggestion!

Comment thread src/oid.rs
#[non_exhaustive]
pub enum ObjectFormat {
/// SHA1 object format (20-byte object IDs)
Sha1,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Sha1,
Sha1 = raw::GIT_OID_SHA1,

or

Suggested change
Sha1,
Sha1 = 1,

(See https://github.com/rust-lang/git2-rs/pull/1206/changes#r3207277482.)

Comment thread src/oid.rs
@@ -65,36 +106,78 @@ impl Oid {
pub fn from_bytes(bytes: &[u8]) -> Result<Oid, Error> {
crate::init();
let mut raw = crate::util::zeroed_raw_oid();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If unstable-sha256 is enabled, raw.kind will be zero and thus not a valid object format!

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lorenzleutgeb commented in #1206 (review)

(inline comment is better for discussion)

I will double down on what I explained in #1206 (comment): If the feature unstable-sha256 is enabled, the implementation of the unsafe function crate::util::zeroed_raw_oid is incorrect. It allows to create OIDs in memory, which are invalid. Their kind of value zero does not correspond to a known git2::ObjectFormat and also neither to libgit2's GIT_OID_SHA1 nor GIT_OID_SHA256. That is because the function uses std::mem::zeroed but the object formats are non-zero bytes. This is a huge footgun. Please consider these options:

  1. Remove crate::util::zeroed_raw_oid.
  2. Remove the unsafe keyword from crate::util::zeroed_raw_oid and reimplement it without unsafe blocks. This will force you to add an argument format: git2::ObjectFormat, otherwise, if unstable-sha256 is enabled, you will not be able to construct a legal zero OID.
  3. Use #[cfg(not(feature = "unstable-sha256"))] crate::util::zeroed_raw_oid.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed follow-up. Will do a cleanup then, though let me clarify the current state:

  • Oid::zero() is now deprecated and returns Oid::ZERO_SHA1 always. The reasoning being high-level binding should not have this kind of zeroed struct as output param. If there is any probably an API design flaw. If people want oid placeholder, they probalby already know the object format so can choice either const. Tthanks for the const suggestion!
  • zeroed_raw_oid() is pub(crate). Every internal call site uses it as an output parameter that libgit2 overwrites before any read. Agreed it may become an internal footgun if misused.
  • FWIW, libgit2 itself uses uninitialized git_oid as output buffers in lots of places. Like maybe_sha_or_abbrev: git_oid oid; is declared uninitialized, then filled by git_oid__fromstrn(&oid, ...) before any read.

Though thanks again, removing one more unsafe is a good thing! Fixed in 8c17e75

The only one concern is that if libgit2 somehow start checking kind for output parameter cases, it might become buggy. Though that will be considered an upstream libgit2 bug.

weihanglo added 8 commits May 8, 2026 08:25
This is a preparation of SHA256 support

* Added `Oid::object_format()`
* Added `Oid::raw_bytes()`
* Changed `Oid::from_bytes()` to use `GIT_OID_SHA1_SIZE`
* Added `Repository::object_format()`
* Added `Remote::object_format()`
* Added `RepositoryInitOptions::object_format()`
Add `*_ext` method variants that accept an `ObjectFormat` param.
The short-hand methods now delegate to `_ext` variants with Sha1.

* Added `Oid::from_str_ext`
* Added `Oid::hash_object_ext`
* Added `Oid::hash_file_ext`
* Added `Diff::from_buffer_ext`
* Added `Index::new_ext`
* Added `Index::open_ext`
* Added `Indexer::new_ext`
* Added `Odb::new_ext`
This makes these call sites automatically handle
the correct object format once SHA256 support is wired up.
Declare the `unstable-sha256` Cargo feature,
reflecting upstream libgit2's `GIT_EXPERIMENTAL_SHA256`.
This is an ABI-breaking experimental feature.

Implement the SHA256 support gated behind this feature:

Things gated:

* `ObjectFormat::Sha256` and non_exhaustive when `unstable-sha256` is off
* all FFI call sites for the overloaded libgit2 function signatures
  * git_oid_fromstrn
  * git_oid_fromraw,
  * git_odb_hash
  * git_odb_hashfile
  * git_diff_from_buffer,
  * git_index_new
  * git_index_open
  * git_indexer_new
  * git_odb_new
* `Repository::object_format` is format-aware
* `Remote::object_format` is format-aware
* `Oid::{as_bytes,object_format,from_bytes}` are format-aware
`Oid::zero().object_format()` panics with `unstable-sha256`
because the raw `kind` field was left as zero.

The zeroed struct is generally use for output param.
Not really meaning to switch a `zero_ext` variant.

Replace it with const values for both kinds.
Construct `git_oid` with a struct literal instead of `mem::zeroed`.

Under `unstable-sha256`,
default `kind` to `GIT_OID_SHA1` so the value is a valid `git_oid_t`
even before libgit2 overwrites it as an output buffer.
@rustbot
Copy link
Copy Markdown
Collaborator

rustbot commented May 8, 2026

This PR was rebased onto a different main commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

Comment thread src/oid.rs
/// Object ID format (hash algorithm).
#[derive(Debug, Copy, Clone, PartialEq, Eq)]
#[cfg_attr(not(feature = "unstable-sha256"), non_exhaustive)]
pub enum ObjectFormat {
Copy link
Copy Markdown

@lorenzleutgeb lorenzleutgeb May 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Downstream I just attempted to print an ObjectFormat and realized that it does not impl Display. Perhaps that might be a useful addition?

View changes since the review

Comment thread examples/init.rs
/// permissions to create the repository with
flag_shared: Option<String>,
#[structopt(name = "object-format", long, value_parser = parse_object_format)]
/// object format to use (sha1 or sha256, requires unstable-sha256 feature)
Copy link
Copy Markdown
Contributor

@DanielEScherzer DanielEScherzer May 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// object format to use (sha1 or sha256, requires unstable-sha256 feature)
/// object format to use (sha1 or sha256, requires unstable-sha256 feature to use the sha256 format)

View changes since the review

Comment thread src/diff.rs
/// This parses the diff assuming SHA1 object IDs. Use
/// [`Diff::from_buffer_ext`] to specify a different format.
pub fn from_buffer(buffer: &[u8]) -> Result<Diff<'static>, Error> {
Self::from_buffer_ext(buffer, crate::ObjectFormat::Sha1)
Copy link
Copy Markdown
Contributor

@DanielEScherzer DanielEScherzer May 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't make suggestions on the unmodified line, but I suggest adding an import for crate::ObjectFormat rather than needing to fully qualify it

View changes since the review

Comment thread src/diff.rs
let data = buffer.as_ptr() as *const c_char;
let len = buffer.len();
unsafe {
// NOTE: Doesn't depend on repo, so lifetime can be 'static
Copy link
Copy Markdown
Contributor

@DanielEScherzer DanielEScherzer May 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this explanation of why things can be 'static is lost

View changes since the review

Comment thread src/index.rs
/// This always creates a SHA1 index.
/// Use [`Index::new_ext`] to create an index with a specific object format.
pub fn new() -> Result<Index, Error> {
Self::new_ext(crate::ObjectFormat::Sha1)
Copy link
Copy Markdown
Contributor

@DanielEScherzer DanielEScherzer May 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same suggestion to add a use statement for the ObjectFormat, also applies in some other files, not going to keep repeating the suggestion

View changes since the review

Comment thread src/oid.rs
/// Parses a hex-formatted object id
/// with a specific object format.
///
/// See [`Oid::from_str`] for more details.
Copy link
Copy Markdown
Contributor

@DanielEScherzer DanielEScherzer May 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

except ::from_str documents that it is an error if the string is longer than 40 hex characters, and presumably for this new method the maximum length is format-dependent, so this method should have some extra documentation

View changes since the review

Comment thread src/oid.rs
#[cfg(not(feature = "unstable-sha256"))]
{
if bytes.len() != raw::GIT_OID_SHA1_SIZE {
return Err(Error::from_str("raw byte array must be 20 bytes"));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the error message is propagated to end users, presumably the end users had some kind of say in the input value

Comment thread src/oid.rs
Ok(Oid { raw: out })
}

/// View this OID as a byte-slice 20 bytes in length.
Copy link
Copy Markdown
Contributor

@DanielEScherzer DanielEScherzer May 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we document that this is going to truncate sha256 ids?

View changes since the review

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand, why would this truncate?

I would agree that the doc comment should probably be updated, though, to mention it can be 32 bytes for sha256.

Comment thread src/oid.rs

/// View the full underlying byte buffer of this OID.
///
/// Currently equivalent to [`Oid::as_bytes`].
Copy link
Copy Markdown
Contributor

@DanielEScherzer DanielEScherzer May 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bin confusing - the "Currently equivalent" assumes that the feature is not enabled, but then the next line discusses that it could be enabled. Maybe

Suggested change
/// Currently equivalent to [`Oid::as_bytes`].
/// For SHA1 hashes, this is equivalent to [`Oid::as_bytes`].

View changes since the review

Comment thread src/oid.rs
use tempfile::TempDir;

#[test]
#[cfg(not(feature = "unstable-sha256"))]
Copy link
Copy Markdown
Contributor

@DanielEScherzer DanielEScherzer May 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so I noticed this on a few other files too - why are various tests disabled when using unstable-sha256? If they would be failing, then that is indicative of a BC break, and if they would be passing, then we should presumably keep the tests enabled?

View changes since the review

Comment thread src/oid.rs

// SHA1 OID comparisons with explicit format
assert_eq!(
Oid::from_str_ext("decbf2b", ObjectFormat::Sha1)?,
Copy link
Copy Markdown
Contributor

@DanielEScherzer DanielEScherzer May 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly, for the *_ext methods when using Sha1, those parts can also be tested even when sha256 support is not enabled

View changes since the review

@ehuss
Copy link
Copy Markdown
Contributor

ehuss commented May 11, 2026

Note: Git 3 will switch to reftable by default (source). Reftable support was only just merged in libgit2/libgit2#7117 which presumably will be in libgit2 2.0. SHA256 support is a major step towards being compatible with git3, but reftable will be another large piece that will be required. Fortunately I don't think that requires any API changes, though we should probably plan to do some testing ahead of the libgit2 2.0 release.

- run: cargo test --features https,ssh
- run: cargo run -p systest
- run: cargo run -p systest --features unstable-sha256
- run: cargo test --locked
Copy link
Copy Markdown
Contributor

@ehuss ehuss May 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving these breaks the --locked check since the earlier cargo run will update the lockfile. Can you either move them back, or add --locked to the other commands?

View changes since the review

- run: cargo run -p systest --features unstable-sha256
- run: cargo test --locked
- run: cargo test --features https,ssh
- run: cargo test --features unstable-sha256
Copy link
Copy Markdown
Contributor

@ehuss ehuss May 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also make sure this covers networking (and examples)? Maybe something like:

Suggested change
- run: cargo test --features unstable-sha256
- run: cargo test --features https,ssh,unstable-sha256

View changes since the review

Comment thread src/index.rs
Comment on lines +1051 to +1054
#[cfg(not(feature = "unstable-sha256"))]
id: Oid::from_bytes(&[0; 20]).unwrap(),
#[cfg(feature = "unstable-sha256")]
id: Oid::from_bytes(&[0; 32]).unwrap(),
Copy link
Copy Markdown
Contributor

@ehuss ehuss May 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a suggestion, take it or leave it:

Suggested change
#[cfg(not(feature = "unstable-sha256"))]
id: Oid::from_bytes(&[0; 20]).unwrap(),
#[cfg(feature = "unstable-sha256")]
id: Oid::from_bytes(&[0; 32]).unwrap(),
#[cfg(not(feature = "unstable-sha256"))]
id: Oid::ZERO_SHA1,
#[cfg(feature = "unstable-sha256")]
id: Oid::ZERO_SHA256,

View changes since the review

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggested these constants, so that will surprise no one, but I second that suggestion. More readable and less unwraping!

Comment thread src/index.rs

#[test]
#[cfg(feature = "unstable-sha256")]
fn smooke_in_memory_index_sha256() {
Copy link
Copy Markdown
Contributor

@ehuss ehuss May 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn smooke_in_memory_index_sha256() {
fn smoke_in_memory_index_sha256() {

View changes since the review

Comment thread src/oid.rs
Ok(Oid { raw: out })
}

/// View this OID as a byte-slice 20 bytes in length.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand, why would this truncate?

I would agree that the doc comment should probably be updated, though, to mention it can be 32 bytes for sha256.

Comment thread src/oid.rs
Comment on lines +250 to +251
/// With the `unstable-sha256` feature, these two methods will differ.
pub fn raw_bytes(&self) -> &[u8] {
Copy link
Copy Markdown
Contributor

@ehuss ehuss May 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this method added? I don't see it used anywhere.

View changes since the review

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread src/test.rs
Comment on lines +40 to +41
#[cfg(feature = "unstable-sha256")]
pub fn repo_init_sha256() -> (TempDir, Repository) {
Copy link
Copy Markdown
Contributor

@ehuss ehuss May 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of duplicating the code here, can we merge them somehow?

I might suggest having a repo_init_ext that takes an ObjectFormat as a parameter, and rewrite repo_init to just call repo_init_ext(ObjectFormat::Sha1).

Or any other approach, just try to remove the duplicate code.

View changes since the review

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-on-review Status: Waiting on review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants