I want to use xmlstarlet from a PowerShell session that is started with Process in a C# application. My main problem is that when I run this command:

./xml.exe ed -N ns=http://www.w3.org/2006/04/ttaf1 -d '//ns:div[not(contains(@xml:lang,''Italian''))]' "C:\Users\1H144708H\Downloads\a.mul.ttml" > "C:\Users\1H144708H\Downloads\a.mul.ttml.conv"

in PowerShell, I get a file with the wrong encoding (I need UTF-8).

In Bash I used to just put

export LANG=it_IT.UTF-8 && 

before the xmlstarlet command, but I don't know how to do the equivalent in PowerShell. Maybe there is an alternative: I saw that xmlstarlet accepts sel --encoding utf-8, but I don't know how to use it in ed mode (I tried placing it after xml.exe, after ed, and so on, but it always fails).

What is the PowerShell alternative to export LANG=it_IT.UTF-8, or how can I use --encoding utf-8 with ed?

PS: I have tried many things, such as:

$MyFile = Get-Content "C:\Users\1H144708H\Downloads\a.mul.ttml"; $Utf8NoBomEncoding = New-Object System.Text.UTF8Encoding $False; [System.IO.File]::WriteAllLines("C:\Users\1H144708H\Downloads\a.mul.ttml.conv", $MyFile, $Utf8NoBomEncoding)

And:

./xml.exe ed -N ns=http://www.w3.org/2006/04/ttaf1 -d '//ns:div[not(contains(@xml:lang,''Italian''))]' "C:\Users\1H144708H\Downloads\a.mul.ttml" |  Out-File "C:\Users\1H144708H\Downloads\a.mul.ttml.conv" -Encoding utf8

But characters like è à ì ù are still wrong. If I save the original file again with Notepad, the characters are correct, but only as long as xmlstarlet is not involved. I need to get the same result from PowerShell with xmlstarlet in the pipeline, and I don't know how.

EDIT: I was able to print the file correctly as UTF-8 in PowerShell:

Get-Content -Path "C:\Users\1H144708H\Downloads\a.mul.ttml" -Encoding UTF8 

But I'm still not able to do the same thing with xmlstarlet.

1 Answer

In the end I decided to write a native C# method that uses a StreamReader to read the file line by line. A simple Contains check finds the line with xml:lang="Language", and from that point I append every line to a string. I set up the head and the end of my file before the while loop, and I stop appending when I read a line that Contains the closing tag of that block. I know this is not the best way to do it, but it works for my case.
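A minimal sketch of that line-by-line approach might look like the following. It is not the exact code I used: the names TtmlLanguageFilter and KeepLanguage are made up for illustration, and it assumes that each opening <div ...> tag and each closing </div> tag sits on its own line and that div elements are not nested. Instead of hardcoding the head and the end of the file, this variant copies every line that lies outside a div block. Both the reader and the writer use UTF-8 without a BOM, so accented characters such as è à ì ù should pass through unchanged.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, named for illustration only.
public static class TtmlLanguageFilter
{
    // Copies inputPath to outputPath, dropping every <div> block whose
    // opening line does not contain xml:lang="<language>".
    // Assumption: <div ...> and </div> each appear on their own line
    // and div elements are not nested.
    public static void KeepLanguage(string inputPath, string outputPath, string language)
    {
        var utf8NoBom = new UTF8Encoding(false); // UTF-8, no byte order mark
        var result = new StringBuilder();
        bool insideDiv = false; // currently between <div ...> and </div>
        bool keeping = false;   // the current div is in the wanted language

        using (var reader = new StreamReader(inputPath, utf8NoBom))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (!insideDiv && line.Contains("<div"))
                {
                    insideDiv = true;
                    keeping = line.Contains("xml:lang=\"" + language + "\"");
                }

                // Lines outside any div (head and end of the file) are always kept.
                if (!insideDiv || keeping)
                {
                    result.AppendLine(line);
                }

                if (insideDiv && line.Contains("</div>"))
                {
                    insideDiv = false;
                    keeping = false;
                }
            }
        }

        File.WriteAllText(outputPath, result.ToString(), utf8NoBom);
    }

    public static void Main(string[] args)
    {
        if (args.Length != 3)
        {
            Console.Error.WriteLine("usage: <input.ttml> <output.ttml> <language>");
            return;
        }
        KeepLanguage(args[0], args[1], args[2]);
    }
}
```

Called with the input path, the output path and Italian as arguments, this would keep only the div whose opening line contains xml:lang="Italian". Note that this is an exact attribute match on one line, which is stricter than the contains() test in the XPath expression from the question.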
